Abstract

Network coding is a new technique to transmit data through a network by letting the intermediate
nodes combine the packets they receive. Given a network, the network coding solvability problem decides
whether all the packets requested by the destinations can be transmitted. In this paper, we introduce a 
new approach to this problem. We define a closure operator on a digraph closely related to the network 
coding instance and we show that the constraints for network coding can all be expressed according to 
that closure operator. Thus, a solution for the network coding problem is equivalent to a so-called solution 
of the closure operator. We can then define the closure solvability problem in general, which surprisingly 
reduces to finding secret-sharing matroids when the closure operator is a matroid. Based on singular
properties of the closure, we are able to generalise the way in which networks can be split into two
distinct parts. We then investigate different properties of closure operators, thus yielding bounds on the
entropy of any solution. Also, the guessing graph approach to network coding solvability is generalised
to any closure operator, which yields bounds on the entropy of possible network codes. We finally prove 
that any nontrivial multiple unicast with two source-receiver pairs is always solvable over all sufficiently 
large alphabets; however, there are arbitrarily narrow bottlenecks when any closure is considered. 

I. Introduction 

Network coding [1] is a protocol which outperforms routing for multicast networks by letting the
intermediate nodes manipulate the packets they receive. In particular, linear network coding [2] is optimal 
in the case of one source; however, it is not the case for multiple sources and destinations 0, 131. 
Although for large dynamic networks, good heuristics such as random linear network coding Q, 
can be used, maximizing the amount of information that can be transmitted over a static network is
fundamental but very hard in practice. Solving this problem by brute force, i.e. considering all possible
operations at all nodes, is computationally prohibitive. In this paper, we provide a new approach to tackle 
this problem based on a closure operator defined on a related digraph. Closure operators are fundamental 
and ubiquitous mathematical objects. 

The guessing number of digraphs is a concept introduced in Q, which connects graph theory, network 
coding, and circuit complexity theory. In Q it was proved that an instance of network coding with r 
sources and r sinks on an acyclic network (referred to as a multiple unicast network) is solvable over a 
given alphabet if and only if the guessing number of a related digraph is equal to r. Moreover, it is proved 
in 0, HI that any network coding instance can be reduced into a multiple unicast network. Therefore, the 
guessing number is a direct criterion on the solvability of network coding. One of the main advantages 
of the guessing game approach is to remove the hierarchy between sources, intermediate nodes, and 
destinations. In [9], the guessing number is also used to disprove a long-standing open conjecture on
circuit complexity. In [10], the guessing number of digraphs was studied, and bounds on the guessing 
number of some particular digraphs were derived. The guessing number is also equal to the so-called 
graph entropy Q, [11]. This allows us to use information inequalities [12] to derive upper bounds on
the guessing number. 

In [13], a graph on all the possible configurations of a digraph is introduced, and is referred to as
the guessing graph. The guessing number of a digraph is equal to the logarithm of the independence 
number of its guessing graph. In terms of the nature of the problem and its search space, solvability of 
network coding is no longer about determining the appropriate operations at each intermediate node, but 
is now about the possible messages that could be transmitted through the network. The operations which 
transmit these messages can then be easily determined. In terms of complexity, the problem of solvability 
of network coding is reduced to a decision problem on the independence number of undirected graphs. 

Shamir introduced the so-called threshold secret sharing scheme in [14]. Suppose a sender wants to
communicate a secret $a \in A$ to $n$ parties, but that an eavesdropper may intercept $r - 1$ of the transmitted
messages. We then require that given any set of $r - 1$ messages, the eavesdropper cannot obtain any
information about the secret. On the other hand, any set of $r$ messages allows the original
secret $a$ to be reconstructed. The elegant technique consists of sending evaluations of a polynomial
$p(x) = \sum_{i=0}^{r-1} p_i x^i$, with $p_0 = a$ and all the other coefficients chosen secretly at random, at $n$ nonzero elements of $A$. The threshold
scheme was then generalised to ideal secret sharing schemes with different access structures, i.e. different 
sets of trusted parties. Brickell and Davenport have proved that the access structure must be the family 
of spanning sets of a matroid; also any linearly representable matroid is a valid access structure [15].
However, there exist matroids (such as the Vamos matroid [16]) which are not valid access structures. For
a given access structure (or equivalently, matroid), finding the scheme is equivalent to a representation
by partitions [17].
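
The threshold scheme described above can be sketched in a few lines. This is only a minimal illustration, assuming the alphabet is the prime field GF(257) and that the shares are the evaluations at the points $1, \ldots, n$; the names `P`, `share` and `reconstruct` are ours and do not come from [14].

```python
import random

P = 257  # assumption: the alphabet A is the prime field GF(P)

def share(secret, r, n, rng=random):
    """Split `secret` into n shares such that any r of them recover it."""
    assert 0 <= secret < P and 1 <= r <= n < P
    # p(x) = secret + p_1 x + ... + p_{r-1} x^{r-1} with random coefficients
    coeffs = [secret] + [rng.randrange(P) for _ in range(r - 1)]

    def p(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

    # evaluate at the n distinct nonzero points 1, ..., n
    return [(x, p(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Recover p(0) by Lagrange interpolation from r shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        secret = (secret + yj * num * pow(den, -1, P)) % P
    return secret
```

For instance, with `shares = share(123, 3, 5)`, any three of the five shares passed to `reconstruct` give back the secret, while two shares are consistent with every possible secret.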

In this paper, we introduce a closure on digraphs, and define the closure solvability problem for any 
closure operator. This yields the following contributions. 

• First of all, this framework encompasses network coding and ideal secret sharing. In particular, 
network coding solvability is equivalent to the solvability of the closure of a digraph associated to 
the network. This framework then allows us to think of network coding solvability on a higher, more 
abstract level. The problem, which used to be about coding functions, is now a simplified problem 
about partitions. The relations with matroids unveiled in [18], [19] are also clarified.

• This approach is particularly elegant, in different aspects. Firstly, the adjacency relations of the graph,
and hence the topology of the network, are not visible in the closure. Therefore, the closure filters out
some unnecessary information from the graph. Secondly, it is striking that all along the paper, most 
proofs will be elementary, including those of far-reaching results. Thirdly, this framework highlights 
the relationship with matroids, via natural concepts such as flats and spans, which are new to the 
author's knowledge. 

• Like the guessing game approach, the closure approach also gets rid of the source-intermediate 
node-destination hierarchy. Then the guessing graph machinery of [13] can be easily generalised to
any closure operator. In other words, the interesting aspects of the guessing game approach can all 
be recast and generalised in our framework. 

• This approach then yields interesting results. First, we determine bounds on the entropy of a solution, 
which are combinatorial in essence and directly follow from the closure operator. This provides tight 
constraints on the shape of solvable closures. Second, it was shown in [13] that the entropy of a 
digraph is equal to the sum of the entropies of its strongly connected components. Thus, one can split 
the solvability problem of a digraph into multiple ones, one for each strongly connected component 
[13]. In this paper, we extend this way of splitting the problem by considering the closures induced
by the subgraphs. We can easily exhibit a strongly connected digraph whose closure is disconnected, 
i.e. which can still be split into two smaller parts. More specifically, if the graph is strongly connected 
but its closure is disconnected, then we can exhibit a set of vertices which are simply useless and 
can be disregarded for solvability. Third, we can prove that any digraph whose closure has rank two 
is solvable. This means that any multiple unicast with two source-receiver pairs is solvable, unless
there exists an easily spotted bottleneck in the network.

The rest of the paper is organised as follows. In Section II we review some useful background. In
Section III we define the closure solvability problem. We then prove that network coding solvability is
equivalent to the solvability of a closure in Section IV. We investigate the properties of closure operators
and provide bounds on the entropy of any solution in Section V, and investigate how to combine closure
operators in Section VI. We then define the solvability graph in Section VII and determine a new way to split
the solvability problem. Closures with rank two are finally studied in Section VIII.

II. Preliminaries 

A. Closure operators 

Throughout this paper, $V$ is a set of $n$ elements. A closure operator on $V$ is a mapping $\mathrm{cl} : 2^V \to 2^V$
which satisfies the following properties [20, Chapter IV]. For any $X, Y \subseteq V$,

1) $X \subseteq \mathrm{cl}(X)$ (extensive);

2) if $X \subseteq Y$, then $\mathrm{cl}(X) \subseteq \mathrm{cl}(Y)$ (isotone);

3) $\mathrm{cl}(\mathrm{cl}(X)) = \mathrm{cl}(X)$ (idempotent).

A closed set is a set equal to its closure. For instance, in a group one may define the closure of a set as 
the subgroup generated by the elements of the set; the family of closed sets is simply the family of all 
subgroups of the group. Another example is given by linear spaces, where the closure of a set of vectors 
is the subspace they span. 

The closure satisfies the following properties. For any $X, Y \subseteq V$,

1) $\mathrm{cl}(X)$ is equal to the intersection of all closed sets containing $X$;

2) $\mathrm{cl}(\mathrm{cl}(X) \cap \mathrm{cl}(Y)) = \mathrm{cl}(X) \cap \mathrm{cl}(Y)$, i.e. the family of closed sets is closed under intersection;

3) $\mathrm{cl}(X \cup Y) = \mathrm{cl}(\mathrm{cl}(X) \cup \mathrm{cl}(Y))$;

4) $X \subseteq \mathrm{cl}(Y)$ if and only if $\mathrm{cl}(X) \subseteq \mathrm{cl}(Y)$.

We refer to

$$r := \min\{|b| : \mathrm{cl}(b) = V\}$$

as the rank of the closure operator. Any set $b \subseteq V$ of size $r$ and whose closure is $V$ is referred to as a
basis of cl.
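
To make the three axioms and the rank concrete, the following sketch checks them by brute force on the subgroup closure mentioned above. It assumes the cyclic group $\mathbb{Z}_6$ (written additively) as the ground set; the helper names `subsets`, `is_closure` and `rank` are ours and only meant as an illustration.

```python
from functools import reduce
from itertools import combinations
from math import gcd

N = 6
V = frozenset(range(N))  # assumption: the ground set is the cyclic group Z_6

def subsets(ground):
    """All subsets of `ground`, as frozensets."""
    items = sorted(ground)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)

def cl(X):
    """Subgroup of Z_N generated by X: the multiples of gcd(X and N)."""
    g = reduce(gcd, X, N)
    return frozenset(range(0, N, g))

def is_closure(closure, ground):
    """Brute-force check of the extensive, isotone and idempotent axioms."""
    all_sets = list(subsets(ground))
    extensive = all(X <= closure(X) for X in all_sets)
    isotone = all(closure(X) <= closure(Y)
                  for X in all_sets for Y in all_sets if X <= Y)
    idempotent = all(closure(closure(X)) == closure(X) for X in all_sets)
    return extensive and isotone and idempotent

def rank(closure, ground):
    """Minimum size of a set whose closure is the whole ground set."""
    ground = frozenset(ground)
    return min(len(b) for b in subsets(ground) if closure(b) == ground)
```

Here the closed sets are exactly the four subgroups of $\mathbb{Z}_6$, and the rank is 1 since the single element 1 generates the whole group.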

An important class of closures are matroids [21], which satisfy the Saunders-Mac Lane axiom: if
$X \subseteq V$, $v \in V$ and $u \in \mathrm{cl}(X \cup v) \setminus \mathrm{cl}(X)$, then $v \in \mathrm{cl}(X \cup u)$. A special class consists of the uniform
matroids, typically denoted as $U_{r,n}$, where

$$U_{r,n}(X) = \begin{cases} V & \text{if } |X| \ge r \\ X & \text{otherwise.} \end{cases}$$

Clearly, $U_{r,n}$ has rank $r$.
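
As an illustration, the exchange axiom can be tested exhaustively on small ground sets. The sketch below builds the uniform matroid as a closure operator and compares it with a two-element closure that is not a matroid; the functions `uniform`, `satisfies_exchange` and `chain` are hypothetical helpers introduced here for the example.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of `ground`, as frozensets."""
    items = sorted(ground)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)

def uniform(r, ground):
    """Closure operator of the uniform matroid U_{r,n} on `ground`."""
    ground = frozenset(ground)

    def cl(X):
        X = frozenset(X)
        return ground if len(X) >= r else X

    return cl

def satisfies_exchange(cl, ground):
    """Check: u in cl(X + v) minus cl(X) implies v in cl(X + u)."""
    for X in subsets(ground):
        for v in ground:
            for u in cl(X | {v}) - cl(X):
                if v not in cl(X | {u}):
                    return False
    return True

def rank(cl, ground):
    """Minimum size of a set whose closure is the whole ground set."""
    ground = frozenset(ground)
    return min(len(b) for b in subsets(ground) if cl(b) == ground)

def chain(X):
    """A closure on {0, 1} that is not a matroid: 0 generates 1, but not conversely."""
    X = frozenset(X)
    return frozenset({0, 1}) if 0 in X else X
```

With `V4 = frozenset(range(4))`, `satisfies_exchange(uniform(2, V4), V4)` holds and `rank(uniform(2, V4), V4)` is 2, whereas `chain` fails the exchange axiom with $X = \emptyset$, $v = 0$, $u = 1$.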

B. Functions and their kernels 

While network coding typically works with functions assigned to vertices, it is elegant to work with 
partitions. Recall that a partition of a set B is a collection of subsets, called parts, which are pairwise 
disjoint and whose union is the whole of $B$. Any function $f : B \to C$ has a kernel denoted as
$\bar{f} := \{f^{-1}(c) : c \in f(B)\}$, defined by the partition of $B$ into pre-images under $f$. Conversely, any partition
of $B$ into at most $|C|$ parts can be viewed as the kernel of some function from $B$ to $C$. Note that two functions
$f, g$ have the same kernel if and only if $f = \pi \circ g$ for some permutation $\pi$ of $C$. Note that the kernel
of any injective function $B \to C$ is the so-called equality partition $E_B$ of $B$ (i.e. the partition with $|B|$
parts). We denote the parts of a partition $f$ as $P_i(f)$ for all $i$.

If any part of $f$ is contained in a unique part of $g$, we say $f$ refines $g$. The equality partition refines any
other partition, while the universal partition (the partition with one part) is refined by any other partition.

The common refinement of two partitions $f, g$ of $B$ is given by $h := f \vee g$ with parts

$$\{P_i(f) \cap P_j(g) : P_i(f) \cap P_j(g) \neq \emptyset\}.$$

We shall usually consider a tuple of $n$ partitions $f = (f_1, \ldots, f_n)$ assigned to elements of a finite set
$V$ with $n$ elements. In that case, for any $X \subseteq V$, we denote the common refinement of all $f_v$, $v \in X$, as
$f_X := \bigvee_{v \in X} f_v$. For any $S, T \subseteq V$ we then have $f_{S \cup T} = f_S \vee f_T$.
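
The notions of kernel, refinement and common refinement translate directly into code. The following is a small sketch, assuming the alphabet $A = \mathbb{Z}_3$ and $B = A^2$; partitions are represented as frozensets of frozensets, and all names are ours.

```python
from itertools import product

def kernel(f, B):
    """Partition of B into the nonempty pre-images of f."""
    parts = {}
    for b in B:
        parts.setdefault(f(b), set()).add(b)
    return frozenset(frozenset(p) for p in parts.values())

def refines(f, g):
    """True if every part of f is contained in some part of g."""
    return all(any(p <= q for q in g) for p in f)

def common_refinement(f, g):
    """f v g: all nonempty intersections of a part of f with a part of g."""
    return frozenset(p & q for p in f for q in g if p & q)

A = (0, 1, 2)                               # assumption: alphabet Z_3
B = list(product(A, repeat=2))              # B = A^2
f1 = kernel(lambda x: x[0], B)              # kernel of the first coordinate
f2 = kernel(lambda x: x[1], B)              # kernel of the second coordinate
f3 = kernel(lambda x: (x[0] + x[1]) % 3, B) # kernel of the sum
equality = frozenset(frozenset({b}) for b in B)
```

Here any two of `f1`, `f2`, `f3` have the equality partition as common refinement, and composing a function with a permutation of $A$ (for example $x \mapsto 2x \bmod 3$) leaves its kernel unchanged.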

C. Digraphs 

Throughout this paper, we shall only consider digraphs [22] with no repeated arcs. We shall denote 
the arc set as E(D), since the letter A will be reserved for the alphabet. However, we do allow edges 
in both directions between two vertices, referred to as bidirectional edges (we shall abuse notation
and identify a bidirectional edge with a corresponding undirected edge). In other words, the digraphs
considered here are of the form $D = (V, E)$, where $E \subseteq V^2$. For any vertex $v$ of $D$, its in-neighborhood
is $v^- = \{u \in V : (u, v) \in E(D)\}$ and its in-degree is the size of its in-neighborhood. By extension, we
denote $X^- = \bigcup_{v \in X} v^-$ for any set of vertices $X$. We say that a digraph is strongly connected if there
is a path from any vertex to any other vertex of the digraph.

The girth of a digraph is the minimum length of a cycle, where we consider a bidirectional edge as a
cycle of length 2. A digraph is acyclic if it has no directed cycles. In this case, we can order the vertices
$v_1, \ldots, v_n$ so that $(v_i, v_j) \in E(D)$ only if $i < j$. The cardinality of a maximum induced acyclic subgraph
of the digraph $D$ is denoted as $\mathrm{mias}(D)$. A set of vertices $X$ is a feedback vertex set if and only if any
directed cycle of $D$ intersects $X$, or equivalently if $V \setminus X$ induces an acyclic subgraph. The minimum
size of a feedback vertex set of $D$ is then equal to $n - \mathrm{mias}(D)$.
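
For small digraphs, both parameters can be computed by exhaustive search. The sketch below is a brute-force illustration only (it is exponential in $n$); digraphs are given as lists of arcs, and the function names are ours.

```python
from itertools import combinations

def is_acyclic(vertices, arcs):
    """Peel off vertices with no in-neighbour left; acyclic iff nothing remains."""
    remaining = set(vertices)
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            if not any(u in remaining for (u, w) in arcs if w == v):
                remaining.remove(v)
                changed = True
    return not remaining

def mias(vertices, arcs):
    """Size of a maximum induced acyclic subgraph (brute force)."""
    vertices = list(vertices)
    for k in range(len(vertices), -1, -1):
        for S in combinations(vertices, k):
            if is_acyclic(S, arcs):
                return k

def min_feedback_vertex_set(vertices, arcs):
    """A smallest set X such that V minus X induces an acyclic subgraph."""
    vertices = list(vertices)
    for k in range(len(vertices) + 1):
        for X in combinations(vertices, k):
            if is_acyclic(set(vertices) - set(X), arcs):
                return set(X)

C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]                    # directed 4-cycle
K3 = [(u, v) for u in range(3) for v in range(3) if u != v]  # clique
```

On the directed cycle `C4` we find mias 3 and a feedback vertex set of size 1; on the clique `K3` we find mias 1 and a feedback vertex set of size 2, in agreement with the identity $n - \mathrm{mias}(D)$.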

Definition 1: [13] For any digraphs $D_1$ and $D_2$ with disjoint vertex sets $V_1$ and $V_2$, we denote the
disjoint union, unidirectional union, and bidirectional union of $D_1$ and $D_2$ as the graphs on $V_1 \cup V_2$ with
respective edge sets

$$E(D_1 \cup D_2) = E(D_1) \cup E(D_2),$$
$$E(D_1 \vec{\cup} D_2) = E(D_1 \cup D_2) \cup \{(v_1, v_2) : v_1 \in V_1, v_2 \in V_2\},$$
$$E(D_1 \bar{\cup} D_2) = E(D_1 \vec{\cup} D_2) \cup \{(v_2, v_1) : v_1 \in V_1, v_2 \in V_2\}.$$

In other words, the disjoint union simply places the two graphs next to each other; the unidirectional
union adds all possible arcs from $D_1$ to $D_2$ only; the bidirectional union adds all possible arcs between
$D_1$ and $D_2$.

D. Guessing game and guessing number 

A configuration on a digraph $D$ over a finite alphabet $A$ is simply an $n$-tuple $x = (x_1, \ldots, x_n) \in A^n$.
A protocol $f = (f_1, \ldots, f_n)$ on $D$ is a mapping between its configurations such that $f(x)$ is locally
defined, i.e. $f_v(x) = f_v(x_{v^-})$ for all $v$. The fixed configurations of $f$ are all the configurations $x \in A^n$
such that $f(x) = x$. The guessing number of $D$ is then defined as the logarithm of the maximum number
of configurations fixed by a protocol of $D$:

$$g(D, A) = \max_f \{\log_{|A|} |\mathrm{Fix}(f)|\}.$$

We now review how to convert a multiple unicast problem in network coding to a guessing game. Note 
that any network coding instance can be converted into a multiple unicast without any loss of generality 
[8], [9]. We suppose that each sink requests an element from an alphabet $A$ from a corresponding source.
This network coding instance is solvable over $A$ if all the demands of the sinks can be satisfied at the
same time. We assume the network instance is given in its circuit representation, where each vertex
represents a distinct coding function and hence the same message flows along every edge coming out of the
same vertex. This circuit representation has $r$ source nodes, $r$ sink nodes, and $m$ intermediate nodes.

Fig. 1. The butterfly network as a guessing game: (a) network coding instance; (b) guessing game.

By merging each source with its corresponding sink node into one vertex, we form the digraph D on 
$n = r + m$ vertices. In general, we have $g(D, A) \le r$ for all $A$ and the original network coding instance
is solvable over A if and only if g(D, A) = r 0. Note that the protocol on the digraph is equivalent to 
the coding and decoding functions on the original network. 

For any digraphs $D_1, D_2$ on disjoint vertex sets, we have
$g(D_1 \cup D_2, A) = g(D_1 \vec{\cup} D_2, A) = g(D_1, A) + g(D_2, A)$
for all alphabets $A$. The guessing number of the bidirectional union is also bounded in [13].

We illustrate the conversion of a network coding instance to a guessing game for the famous butterfly
network in Figure 1. It is well-known that the butterfly network is solvable over all alphabets, and
conversely it was shown that the clique $K_3$ has guessing number 2 over any alphabet. The combinations
and decoding operations on the network are equivalent to the protocol on the digraph. For instance, if
$v_3$ transmits the opposite of the sum of the two incoming messages modulo $|A|$ on the network, the
corresponding protocol lets all nodes guess minus the sum modulo $|A|$ of their incoming elements.
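
For very small digraphs and alphabets, the guessing number can be computed directly from its definition by enumerating all protocols. The sketch below does this for the alphabet $\{0, \ldots, q-1\}$; it is an exhaustive search meant purely as an illustration (the number of protocols grows doubly exponentially), and the function name and the in-neighbour list encoding are our own choices.

```python
from itertools import product
from math import log

def guessing_number(n, in_nbrs, q):
    """Brute-force g(D, A) with |A| = q; in_nbrs[v] lists the in-neighbours of v."""
    A = range(q)
    configs = list(product(A, repeat=n))
    # every local function f_v : A^{|v^-|} -> A, stored as a lookup table
    local_tables = []
    for v in range(n):
        inputs = list(product(A, repeat=len(in_nbrs[v])))
        tables = [dict(zip(inputs, outs))
                  for outs in product(A, repeat=len(inputs))]
        local_tables.append(tables)
    best = 0
    for protocol in product(*local_tables):
        # count the configurations x with f(x) = x
        fixed = sum(
            all(protocol[v][tuple(x[u] for u in in_nbrs[v])] == x[v]
                for v in range(n))
            for x in configs
        )
        best = max(best, fixed)
    return log(best, q)

K3 = [[1, 2], [0, 2], [0, 1]]  # clique on three vertices
C3 = [[2], [0], [1]]           # directed 3-cycle
```

Over the binary alphabet, this search returns 2 for `K3` (up to floating-point rounding), matching the value stated above, and 1 for the directed cycle `C3`.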

E. Parameters of undirected graphs 

An independent set in a (simple, undirected) graph is a set of vertices where any two vertices are 
non-adjacent. The independence number $\alpha(G)$ of an undirected graph $G$ is the maximum cardinality of
an independent set. We also denote the maximum degree and the clique and chromatic numbers of an
undirected graph $G$ as $\Delta(G)$, $\omega(G)$, $\chi(G)$, respectively (see [23] for definitions of these parameters).
For a connected vertex-transitive graph which is neither an odd cycle nor a complete graph, we have [23,
Corollary 7.5.2]

$$\omega(G) \le \frac{|V(G)|}{\alpha(G)} \le \chi(G) \le \Delta(G). \qquad (1)$$

The chromatic number and the independence number of a vertex-transitive graph are related by [24]
(using the no-homomorphism lemma in [25])

$$\chi(G) \le (1 + \log \alpha(G)) \frac{|V(G)|}{\alpha(G)}. \qquad (2)$$

We now review three types of products of graphs; all products of two graphs $G_1$ and $G_2$ have
$V(G_1) \times V(G_2)$ as vertex set. We denote two adjacent vertices $u$ and $v$ in a graph as $u \sim v$.

• First, in the co-normal product $G_1 \oplus G_2$, we have $(u_1, u_2) \sim (v_1, v_2)$ if and only if $u_1 \sim v_1$ or
$u_2 \sim v_2$. We have

$$\alpha(G_1 \oplus G_2) = \alpha(G_1)\alpha(G_2). \qquad (3)$$

• Second, in the lexicographic product (also called composition) $G_1 \cdot G_2$, we have $(u_1, u_2) \sim (v_1, v_2)$
if and only if either $u_1 = v_1$ and $u_2 \sim v_2$, or $u_1 \sim v_1$. Although this product is not commutative,
we have

$$\alpha(G_1 \cdot G_2) = \alpha(G_1)\alpha(G_2).$$

• Third, in the cartesian product $G_1 \square G_2$, we have $(u_1, u_2) \sim (v_1, v_2)$ if and only if either $u_1 = v_1$
and $u_2 \sim v_2$, or $u_2 = v_2$ and $u_1 \sim v_1$. We have

$$\chi(G_1 \square G_2) = \max\{\chi(G_1), \chi(G_2)\},$$
$$\alpha(G_1 \square G_2) \le \min\{\alpha(G_1)|V(G_2)|, \alpha(G_2)|V(G_1)|\}.$$

III. Coding functions and solvability 

A. Coding functions 

Definition 2: Let $V$ be a finite set of $n$ elements, $a$ be a transformation of $2^V$, and $A$, $B$ be finite sets
($A$ is referred to as the alphabet, $|A| \ge 2$). A coding function for $(a, A, B)$ is a tuple $f = (f_1, \ldots, f_n)$
of $n$ partitions of $B$, where each partition is in at most $|A|$ parts, such that $f_X = f_{a(X)}$ for all $X \subseteq V$.

The term coding function comes from the fact that $f_i$ should be viewed as the kernel of a function
$B \to A$ for all $i$. We shall typically view $B$ as a cartesian product of several copies of the alphabet $A$,
but we do not need this restriction yet.

We say that two transformations $a$ and $a'$ of $2^V$ are equivalent if the following holds: any tuple of
partitions $f$ is a coding function for $a$ if and only if it is a coding function for $a'$.

Theorem 1: Let $a$ be a transformation of $2^V$; then there exists a closure operator on $V$ which is
equivalent to $a$.

Proof: We take three steps. First, construct the digraph on $2^V$ with arcs $(Y, a(Y))$ for all $Y \subseteq V$.
For any $X \subseteq V$, denote the connected component containing $X$ as $C(X)$. Then we claim that
$b(X) := \bigcup_{Y \in C(X)} Y$ is equivalent to $a$ (note that $b$ is extensive). Indeed, if $f$ is a coding function for $a$, then
$f_X = f_{a(X)}$ and by induction it is easy to show that for any $Y \in C(X)$, $f_Y = f_X$. Using the properties
described above, we obtain that $f_{b(X)} = f_X$. Conversely, we have $b(X) = b(a(X))$ and hence if $f$ is a
coding function for $b$, then $f_X = f_{b(X)} = f_{b(a(X))} = f_{a(X)}$ for all $X$.

We now claim that $c(X) := \bigcup_{Y \subseteq X} b(Y)$ is equivalent to $b$ (note that $c$ is extensive and isotone). Indeed,
if $f$ is a coding function for $b$ and $Y \subseteq X$, then $f_X$ refines $f_Y = f_{b(Y)}$. By the properties above, we
obtain that $f_X$ refines $f_{c(X)}$; the converse is immediate and thus $f_X = f_{c(X)}$. Conversely, if $f$ is a coding
function for $c$, then $f_X = f_{c(X)}$ refines $f_{b(X)}$ and hence is equal to $f_{b(X)}$ for all $X$.

Finally, we claim that $\mathrm{cl}(X) := c^n(X)$ is equivalent to $c$ (remark that cl is a closure). Indeed, if $f$ is a
coding function for $c$, then $f_X = f_{c(X)} = \cdots = f_{c^n(X)}$. Conversely, $f_X = f_{\mathrm{cl}(X)}$ refines $f_{c(X)}$. ■

B. Closure solvability 

For the case of closure operators, we can restrict ourselves to the case where $B = A^r$, where $r$ is the
rank of the closure operator. Indeed, if there exists a coding function $f$ of partitions of $B$, where $f_V$ has
$k$ parts, then for any $C$ of size $k$ there exists a coding function $g$ of partitions of $C$ with $g_V = E_C$. Note
that for any coding function $f$ and any basis $b$, $f_b = f_V$ and hence $f_V$ has at most $|A|^r$ parts. Therefore,
we are only interested in coding functions defined on $B = A^r$. We then say that a coding function is a
solution if $f_V$ is the equality partition: $f_V = E_{A^r}$.

We now define the closure solvability problem. The instance is the ordered pair $(\mathrm{cl}, A)$, where cl is a
closure operator on $V$ with rank $r$, and $A$ is a finite alphabet with $|A| \ge 2$. The problem is to determine
whether there exists an $n$-tuple $f = (f_1, \ldots, f_n)$ of partitions of $A^r$ in at most $|A|$ parts such that

$$f_X = f_{\mathrm{cl}(X)} \text{ for all } X \subseteq V,$$
$$f_V = E_{A^r}.$$

For any partition $g$ of $A^r$, we define its entropy as

$$H(g) := r - |A|^{-r} \sum_i |P_i(g)| \log_{|A|} |P_i(g)|.$$

This corresponds to the case where the input of the function is uniformly distributed. The equality partition
on $A^r$ is the only partition with full entropy $r$. Denoting $H_f(X) := H(f_X)$, we can recast the conditions
above as

$$H_f(X) = H_f(\mathrm{cl}(X)) \text{ for all } X \subseteq V,$$
$$H_f(V) = r.$$

We denote the maximum entropy of a coding function for cl over $A$ as $H(\mathrm{cl}, A)$. Therefore, cl is
solvable over $A$ if and only if $H(\mathrm{cl}, A) = r$.
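
The entropy of a partition is straightforward to evaluate numerically. The following sketch assumes the binary alphabet and $r = 2$, and represents partitions as frozensets of frozensets; the helper names are ours.

```python
from itertools import product
from math import log

def kernel(f, B):
    """Partition of B into the nonempty pre-images of f."""
    parts = {}
    for b in B:
        parts.setdefault(f(b), set()).add(b)
    return frozenset(frozenset(p) for p in parts.values())

def entropy(g, q, r):
    """H(g) = r - q^{-r} * sum_i |P_i(g)| * log_q |P_i(g)| for a partition g of A^r."""
    return r - sum(len(p) * log(len(p), q) for p in g) / q ** r

q, r = 2, 2                                  # assumption: |A| = 2 and rank 2
B = list(product(range(q), repeat=r))        # B = A^r
f1 = kernel(lambda x: x[0], B)               # two parts of size 2
equality = frozenset(frozenset({b}) for b in B)
universal = frozenset({frozenset(B)})
```

In this setting the equality partition has entropy 2, the kernel of one coordinate has entropy 1, and the universal partition has entropy 0.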

If cl is a matroid, the solvability problem is equivalent to determining whether it is a secret-sharing
matroid, i.e. whether there exists a scheme whose access structure is the family of spanning
sets of that matroid. We shall prove this in a more general setting in Section V; however, let us give an
informal proof here. Let $f$ be a solution for a matroid $\mathrm{cl}_M$. Let $\mathrm{rk}_M$ be the rank function associated
to $\mathrm{cl}_M$, i.e. $\mathrm{rk}_M(X) = \min\{|b| : \mathrm{cl}_M(b) = \mathrm{cl}_M(X)\}$. Then for any $X$, we have $H_f(X) \le \mathrm{rk}_M(X)$.
Moreover, there exists $Y$ such that $\mathrm{cl}_M(X \cup Y) = V$ and $\mathrm{rk}_M(Y) + \mathrm{rk}_M(X) = r$; then
$H_f(X) \ge H_f(V) - H_f(Y) \ge \mathrm{rk}_M(X)$. Thus, $H_f(X) = \mathrm{rk}_M(X)$ for all $X$.

In particular, a solution for the uniform matroid $U_{r,n}$ forms an $(n, r, n - r + 1)$ $|A|$-ary MDS code. A
solution for $U_{2,n}$ is then equivalent to $n - 2$ mutually orthogonal latin squares; it exists for all sufficiently
large alphabets. This illustrates the complexity of this problem: representing $U_{2,4}$ (i.e., determining the
possible orders for two mutually orthogonal latin squares) was wrongly conjectured by Euler and solved
in 1960 [26].

Combinatorial representations [19] were recently introduced in order to capture some of the dependency
relations amongst functions. A solution for the uniform matroid corresponds to a combinatorial
representation of its family of bases; however, in general this is not true. Indeed, any family of bases has 
a combinatorial representation, while we shall exhibit closure operators which are not solvable. 

C. Simplifications 

So far, we consider any possible closure operator. The purpose of this section is to reduce the scope of 
our study by generalising some concepts arising from matroid theory. A vertex is a loop if it belongs to 
the closure of the empty set. We say that two vertices u, v are parallel if cl(u) = cl(v); clearly parallelism 
is an equivalence relation. We say that cl is simple if it has no loops and no parallel vertices. Any closure 
cl can be converted to a simple closure operator cl* by removing loops and only considering one element 
per parallel class. Lemma 1 below, whose proof is straightforward, shows that we only need to consider
simple closures. 

Lemma 1: cl and cl* have the same rank and are solvable over the same alphabets.

There is a natural partial order on the family of closure operators of $V$. We denote $\mathrm{cl}_1 \le \mathrm{cl}_2$ if for
all $X$, $\mathrm{cl}_1(X) \subseteq \mathrm{cl}_2(X)$. This is a partial order, with maximum element $U_{0,n}$ (with $\mathrm{cl}(\emptyset) = V$) and
minimum element $U_{n,n}$ (where $\mathrm{cl}(X) = X$ for all $X$). We have $\mathrm{cl}_1 \le \mathrm{cl}_2$ if and only if the family of
$\mathrm{cl}_1$-closed sets contains the family of $\mathrm{cl}_2$-closed sets.

Any tuple $f = (f_1, \ldots, f_n)$ of partitions of $A^r$ into at most $|A|$ parts naturally yields a closure operator
on $V$: we define

$$\mathrm{cl}_f(X) := \{v \in V : f_{X \cup v} = f_X\} = \{v \in V : H_f(X \cup v) = H_f(X)\}.$$
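
As a small worked example, the closure $\mathrm{cl}_f$ can be computed directly from a tuple of partitions. The sketch below assumes $A = \mathbb{Z}_3$, $r = 2$, and the tuple given by the kernels of $x_1$, $x_2$ and $x_1 + x_2$ (a single parity-check code); vertices are indexed from 0 and the helper names are ours.

```python
from itertools import combinations, product

def kernel(f, B):
    """Partition of B into the nonempty pre-images of f."""
    parts = {}
    for b in B:
        parts.setdefault(f(b), set()).add(b)
    return frozenset(frozenset(p) for p in parts.values())

def f_X(f, X, B):
    """Common refinement of the f_v, v in X (universal partition if X is empty)."""
    parts = [frozenset(B)]
    for v in X:
        parts = [p & q for p in parts for q in f[v] if p & q]
    return frozenset(parts)

def cl_f(f, X, B):
    """cl_f(X): the vertices v such that adding f_v does not refine f_X further."""
    X = frozenset(X)
    base = f_X(f, X, B)
    return frozenset(v for v in range(len(f)) if f_X(f, X | {v}, B) == base)

A = range(3)                                 # assumption: alphabet Z_3
B = list(product(A, repeat=2))               # B = A^2
f = (kernel(lambda x: x[0], B),
     kernel(lambda x: x[1], B),
     kernel(lambda x: (x[0] + x[1]) % 3, B))
```

For this tuple, `cl_f(f, X, B)` returns $X$ itself when $|X| \le 1$ and the whole vertex set when $|X| \ge 2$, so that $\mathrm{cl}_f = U_{2,3}$.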

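Proposition 1: $f$ is a coding function for cl if and only if $\mathrm{cl} \le \mathrm{cl}_f$. Therefore, if $\mathrm{cl}_1 \le \mathrm{cl}_2$ have the
same rank and $\mathrm{cl}_2$ is solvable over $A$, then $\mathrm{cl}_1$ is solvable over $A$.

Proof: If $f$ is a coding function for cl, then $f_{\mathrm{cl}(X)} = f_{X \cup v} = f_X$ for all $v \in \mathrm{cl}(X)$ and hence $\mathrm{cl} \le \mathrm{cl}_f$.
Conversely, if $\mathrm{cl}(X) \subseteq \mathrm{cl}_f(X)$, then denote $\mathrm{cl}(X) = \{v_1, \ldots, v_k\}$ and
$f_{\mathrm{cl}(X)} = f_{X \cup v_1} \vee f_{v_2, \ldots, v_k} = f_X \vee f_{v_2, \ldots, v_k} = \cdots = f_X$.

Since $\mathrm{cl}_2$ is solvable, there exists a coding function $f$ for $\mathrm{cl}_2$ with entropy $r$, where $r$ is the rank of
$\mathrm{cl}_1$ and $\mathrm{cl}_2$. But then $\mathrm{cl}_1 \le \mathrm{cl}_2 \le \mathrm{cl}_f$ and hence $f$ is also a solution for $\mathrm{cl}_1$. ■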

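Any closure $c$ having $\mathcal{B}$ as family of bases satisfies $\mathrm{cl}_{\mathcal{B}} \le c$, where

$$\mathrm{cl}_{\mathcal{B}}(X) = \begin{cases} V & \text{if } \exists\, b \in \mathcal{B} : b \subseteq X \\ X & \text{otherwise.} \end{cases}$$

Since $\mathrm{cl}_{\mathcal{B}} \le U_{r,n}$, the uniform matroid with the same rank, Proposition 1 shows that $\mathrm{cl}_{\mathcal{B}}$ is solvable over
every alphabet over which $U_{r,n}$ is solvable. More generally, the same holds for any closure
operator of rank $r$ satisfying $\mathrm{cl}(X) = X$ for any $X$ with cardinality at most $r - 1$.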

IV. Closure in digraphs and network coding 
A. Definition and basic properties 
Let D be a digraph on V. 

Definition 3: The $D$-closure of a set of vertices $X$ is defined as follows. We let
$c_D(X) = X \cup \{v \in V : v^- \subseteq X\}$ and the $D$-closure of $X$ is $\mathrm{cl}_D(X) := c_D^n(X)$.

This definition can be intuitively explained as follows. Suppose we assign a function to each vertex of 
D, which only depends on its in-neighbourhood (the function which decides which message the vertex 
will transmit). If we know the messages sent by the vertices of $X$, we also know the messages which
will be sent by any vertex in $c_D(X)$. By applying this iteratively, we can determine all messages sent
by the vertices in $\mathrm{cl}_D(X)$.
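
The iterative description above translates into a short fixed-point computation. This is a minimal sketch, assuming the digraph is given as a dictionary mapping each vertex to the list of its in-neighbours; it adds vertices one at a time instead of applying $c_D$ synchronously, which reaches the same closed set.

```python
def d_closure(X, in_nbrs):
    """cl_D(X): add every vertex whose whole in-neighbourhood is known, until stable."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_nbrs.items():
            if v not in closed and set(nbrs) <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)

# The clique K_3, i.e. the guessing-game digraph of the butterfly network.
K3 = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
# An acyclic digraph: every vertex is determined from the empty set.
acyclic = {1: [], 2: [1], 3: [1, 2]}
```

For the clique, `d_closure({1}, K3)` is `{1}` while `d_closure({1, 2}, K3)` is the whole vertex set; for the acyclic digraph, `d_closure(set(), acyclic)` already contains every vertex.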

We give an alternate definition of the closure below. 

Lemma 2: For any $X \subseteq V$, $Y = \mathrm{cl}_D(X) \setminus X$ is the largest set of vertices inducing an acyclic subgraph
such that $Y^- \subseteq Y \cup X$.

Proof: First, it is clear that $Y$ is a set of vertices inducing an acyclic subgraph such that
$Y^- \subseteq Y \cup X$. Conversely, suppose $Z$ induces an acyclic subgraph and $Z^- \subseteq Z \cup X$. Denoting $Z_0 = \emptyset$ and
$Z_i = \{v \in Z : v^- \subseteq X \cup Z_{i-1}\}$ for $1 \le i \le n$, we have $Z_i \subseteq c_D^i(X) \setminus X$ and hence $Z = Z_n \subseteq Y$. ■

Example 1: Some special classes of digraphs yield famous closure operators. 

1) If $D$ is an acyclic digraph, then $\mathrm{cl}_D = U_{0,n}$, i.e. $\mathrm{cl}_D(\emptyset) = V$. This immediately follows from
Lemma 2.

2) If $D$ is the directed cycle $C_n$, then $\mathrm{cl}_{C_n} = U_{1,n}$. Therefore, the solutions are $(n, 1, n)$ MDS codes,
such as the repetition code.

3) If $D$ is the clique $K_n$, then $\mathrm{cl}_{K_n} = U_{n-1,n}$. Therefore, the solutions of $\mathrm{cl}_{K_n}$ are exactly $(n, n - 1, 2)$
MDS codes, such as the single parity-check code.

4) If $D$ has a loop on each vertex, then $\mathrm{cl}_D = U_{n,n}$.

Since $\mathrm{cl}_D(X) = V$ if and only if $X$ is a feedback vertex set of $D$, we obtain that $\mathrm{cl}_D$ has rank
$r_D = n - \mathrm{mias}(D)$. We remark that not all closures can arise from digraphs.
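
The four cases of Example 1 and the rank formula can be checked by brute force on four vertices. The sketch below is only an illustration under our own encoding (a dictionary of in-neighbour lists); the helper names and the fifth, irregular digraph `mixed` are assumptions made for the example.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of `ground`, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def d_closure(X, in_nbrs):
    """cl_D(X), computed by adding determined vertices until stable."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_nbrs.items():
            if v not in closed and set(nbrs) <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)

def same_as_uniform(in_nbrs, r):
    """True if cl_D coincides with the uniform matroid U_{r,n}."""
    V = frozenset(in_nbrs)
    return all(d_closure(X, in_nbrs) == (V if len(X) >= r else X)
               for X in subsets(V))

def rank(in_nbrs):
    """Minimum size of a set whose D-closure is V."""
    V = frozenset(in_nbrs)
    return min(len(X) for X in subsets(V) if d_closure(X, in_nbrs) == V)

def mias(in_nbrs):
    """S induces an acyclic subgraph iff the closure of the empty set in D[S] is S."""
    def induced(S):
        return {v: [u for u in in_nbrs[v] if u in S] for v in S}
    return max(len(S) for S in subsets(in_nbrs)
               if d_closure((), induced(S)) == S)

n = 4
acyclic = {0: [], 1: [0], 2: [0, 1], 3: [2]}
cycle = {v: [(v - 1) % n] for v in range(n)}
clique = {v: [u for u in range(n) if u != v] for v in range(n)}
loops = {v: [v] for v in range(n)}
mixed = {0: [1], 1: [0, 2], 2: [3], 3: [2, 0]}  # hypothetical irregular digraph
```

One finds $U_{0,4}$, $U_{1,4}$, $U_{3,4}$ and $U_{4,4}$ for the first four digraphs respectively, and `rank(D) == n - mias(D)` for all five.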

Lemma 3: The uniform matroid $U_{r,n}$ is the closure of a digraph if and only if $r \in \{0, 1, n - 1, n\}$.

Proof: The cases $r = 0, 1, n - 1, n$ respectively have been illustrated in Example 1. Conversely,
suppose a digraph has closure $U_{r,n}$, where $2 \le r \le n - 2$. Then any set of $n - r$ vertices induces
an acyclic subgraph, while any set of $n - r + 1$ vertices induces a cycle. This implies that any set of
$n - r$ vertices induces a (directed) path. Without loss, let $v_1, \ldots, v_{n-r}$ induce a path (in that order), then
$v_1, \ldots, v_{n-r}, v_{n-r+1}$ induce a cycle, and so do $v_1, \ldots, v_{n-r}, v_{n-r+2}$. Therefore, in the subgraph induced
by $v_2, \ldots, v_{n-r+2}$, the vertex $v_{n-r}$ has out-degree 2 and hence that graph is not a cycle. ■

B. Equivalence of closure solvability and network coding solvability 

We consider a multiple unicast instance: an acyclic network $N$ with $r$ sources $s_1, \ldots, s_r$, $r$ destinations
$d_1, \ldots, d_r$, and $m$ intermediate nodes, where each destination $d_i$ requests the message $x_i$ sent by $s_i$. We
assume that the messages $x_i$, along with everything carried on one link, are elements of an alphabet $A$.
Also, any vertex transmits the same message on all its outgoing links. We denote the cumulative coding
functions at the nodes as $f = (f_1, \ldots, f_n)$, where the first $r$ indices correspond to the destinations and
the other $m$ indices to the intermediate nodes, and $n = r + m$.

We remark that if the destination $d_i$ is able to recover $x_i$ from the messages it receives, it is also able
to recover any function of that message. Conversely, if it can recover $\pi(x_i)$ for some permutation
$\pi$ of $A$, then it can recover $x_i = \pi^{-1}(\pi(x_i))$ as well. We can then relax the condition and let $d_i$ request
any such $\pi(x_i)$. Viewing $x_i$ as a function from $A^r$ to $A$, sending $(x_1, \ldots, x_r)$ to $x_i$, we remark that $\pi(x_i)$
has the same kernel as $x_i$ for any permutation $\pi$. Therefore, the correct relaxation is for $d_i$ to request
that the partition assigned to it be the same as that of the source $s_i$.

The relaxation above is one argument to consider partitions instead of functions. The second main 
argument is that the dependency relations are completely (and elegantly) expressed in terms of partitions, 
as illustrated in the proof of Theorem 2 below.

We now convert the network coding solvability problem into a closure solvability problem. Recall the
digraph $D$ on $n$ vertices corresponding to the guessing game, reviewed in Section II-D.

Theorem 2: $f$ is a solution to the network coding instance if and only if $\mathrm{cl}_D$ has rank $r$ and $f$ is a
solution to $\mathrm{cl}_D$.

Proof: Let us temporarily extend $f$ to the sources as well. There are only three constraints for
the coding functions: $f_{v^-}$ refines $f_v$ for any $v$; $f_{s_i} = f_{d_i}$ for all $1 \le i \le r$; and $f_S = E_{A^r}$ for
$S = \{s_1, \ldots, s_r\}$. The second condition indeed implies that we only need to consider $f = (f_1, \ldots, f_n)$,
while the third one is equivalent to $f_V = E_{A^r}$, where $V$ corresponds to the sources and intermediate
nodes.

First, remark that the first constraint is equivalent to $f_{v^- \cup v} = f_{v^-}$ for all $v \in D$. It is then easy to show
that this is equivalent to $f$ being a coding function for $c_D$. Applying $c_D$ iteratively (just like in the proof
of Theorem 1), we obtain that this is equivalent to $f$ being a coding function of $\mathrm{cl}_D$.

Note that $\mathrm{cl}_D$ has rank $r$ if and only if the original source-receiver pairs form a basis of $\mathrm{cl}_D$. Therefore,
the third constraint is then equivalent to $\mathrm{cl}_D$ having rank $r$ and $f$ being a solution for it. ■

We remark that the closure approach differs from Riis's guessing game approach. Although it also gets 
rid of the source/intermediate node/receiver hierarchy and works on the same digraph, the distinction is 
in the fact that now / corresponds to the cumulated coding functions. 

V. Properties of Closure operators 

In this section, we investigate the properties of closure operators in general and we derive bounds on
the entropy of their coding functions.

Fig. 2. Example where the inner rank is not monotone.

A. Inner and outer Ranks 

First of all, we are interested in upper bounds on the entropy of coding functions. 

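Definition 4: The inner rank and outer rank of a subset $X$ of vertices are respectively given by

\[ \mathrm{ir}(X) := \min\{|b| : \mathrm{cl}(X) = \mathrm{cl}(b)\}, \]

\[ \mathrm{or}(X) := \min\{|b| : X \subseteq \mathrm{cl}(b)\} = \min\{|b| : \mathrm{cl}(X) \subseteq \mathrm{cl}(b)\}. \]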

Although the notations should reflect which closure operator is used in order to be rigorous, we shall 
usually omit this dependence for the sake of clarity. Instead, if the closure operator is "decorated" by 
subscripts or superscripts, then the corresponding parameters will be decorated in the same fashion. 

A set $i$ with $|i| = \mathrm{ir}(X)$ and $\mathrm{cl}(i) = \mathrm{cl}(X)$ is called an inner basis of $X$; similarly a set $o$ with
$|o| = \mathrm{or}(X)$ and $\mathrm{cl}(X) \subseteq \mathrm{cl}(o)$ is called an outer basis of $X$.

The following properties are an easy exercise. 

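Proposition 2: For any $X, Y \subseteq V$,

1) $\mathrm{or}(\mathrm{cl}(X)) = \mathrm{or}(X)$ and $\mathrm{ir}(\mathrm{cl}(X)) = \mathrm{ir}(X)$;

2) $\mathrm{or}(X) \le \mathrm{ir}(X) \le |X|$;

3) $\mathrm{or}(X \cup Y) \le \mathrm{or}(X) + \mathrm{or}(Y)$ and $\mathrm{ir}(X \cup Y) \le \mathrm{ir}(X) + \mathrm{ir}(Y)$;

4) $\mathrm{or}(\emptyset) = \mathrm{ir}(\emptyset) = 0$ and $\mathrm{or}(V) = \mathrm{ir}(V) = r$;

5) if $X \subseteq Y$, then $\mathrm{or}(X) \le \mathrm{or}(Y)$.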

The closure of the empty set is the only closed set of (inner and outer) rank 0, while V is not necessarily 
the unique closed set of (inner or outer) rank r. 

Note that the inner rank is not monotonous, as seen in the example in Figure 2. We have $\mathrm{cl}(4) = V$
and hence $\mathrm{ir}(V) = 1$, while $\mathrm{ir}(123) = 2$.
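The two ranks of Definition 4 can be computed by brute force on small examples. The sketch below is only illustrative: the digraph is a hypothetical one (it is not claimed to be the graph of Figure 2), chosen so that its closure reproduces the values quoted above, and the helper names `cl`, `inner_rank` and `outer_rank` are ours. The closure is the digraph closure, obtained by repeatedly adding every vertex whose in-neighbourhood already lies in the set.

```python
from itertools import combinations

# Hypothetical digraph (NOT taken from Figure 2): in-neighbourhoods, with a loop on vertex 4.
IN_NBRS = {1: {4}, 2: {4}, 3: {1, 2}, 4: {4}}
V = frozenset(IN_NBRS)


def cl(X):
    """Digraph closure: keep adding vertices whose in-neighbourhood is already inside."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, preds in IN_NBRS.items():
            if v not in closed and preds <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)


def subsets_by_size(ground):
    """Yield all subsets of `ground`, smallest first."""
    items = sorted(ground)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def inner_rank(X):
    """ir(X) = min |b| such that cl(b) = cl(X)."""
    target = cl(X)
    return next(len(b) for b in subsets_by_size(V) if cl(b) == target)


def outer_rank(X):
    """or(X) = min |b| such that X is contained in cl(b)."""
    X = frozenset(X)
    return next(len(b) for b in subsets_by_size(V) if X <= cl(b))
```

On this assumed digraph, `inner_rank(V)` is 1 (the set {4} generates everything) whereas `inner_rank({1, 2, 3})` is 2, so the inner rank decreases when passing to a superset; the outer rank of {1, 2, 3} is 1, in line with Proposition 2.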

If $\mathrm{cl}_1(X) \subseteq \mathrm{cl}_2(X)$ for some $X$, then $\mathrm{or}_1(X) \ge \mathrm{or}_2(X)$. Indeed, any outer basis of $X$ with respect
to $\mathrm{cl}_1$ is also an outer basis of $X$ with respect to $\mathrm{cl}_2$. In particular, if $\mathrm{cl}_1 \le \mathrm{cl}_2$, then $\mathrm{or}_1(X) \ge \mathrm{or}_2(X)$
for all $X$.

Lemma 4: Let $G : 2^V \to \mathbb{R}$ satisfy $0 \le G(X) \le |X|$ and $G(\mathrm{cl}(X)) = G(X)$. Then $G(X) \le \mathrm{ir}(X)$
for all $X$. Also, if $X \subseteq Y$ implies $G(X) \le G(Y)$, then $G(X) \le \mathrm{or}(X)$ for all $X$.

Proof: First, if $i$ is an inner basis of $X$, then $G(X) = G(\mathrm{cl}(i)) = G(i) \le |i| = \mathrm{ir}(X)$. Second, if $o$
is an outer basis of $X$, $G(X) \le G(\mathrm{cl}(o)) = G(o) \le |o| = \mathrm{or}(X)$. ■

This Lemma proves that we get subadditivity for free. Since the entropy satisfies all the conditions of
Lemma 4, we obtain an upper bound on the entropy.

Corollary 1: For any coding function $f$ and any $X \subseteq V$, $H_f(X) \le \mathrm{or}(X)$.

B. Flats and span 

Before we move on to lower bounds on the entropy, we define two fundamental concepts. 
Definition 5: A flat is a subset $F$ of vertices for which there is no $X \supset F$ with $\mathrm{or}(X) = \mathrm{or}(F)$.
$\mathrm{cl}(\emptyset)$ is the only flat with rank 0, and $V$ is the only flat with rank $r$.
Proposition 3: Flats satisfy the following properties. 

1) Any flat F is a closed set; 

2) or(F) = ir(F); 

3) for any $X$, there exists a flat $F \supseteq X$ with $\mathrm{or}(F) = \mathrm{or}(X)$.

Proof: 1) Since $\mathrm{cl}(F)$ contains $F$ while having the same rank as $F$, it cannot properly contain $F$.

2) Let $o$ be an outer basis of $F$. Since $F \subseteq \mathrm{cl}(o)$ while $\mathrm{or}(F) = \mathrm{or}(\mathrm{cl}(o))$, we obtain $F = \mathrm{cl}(o)$ and
$o$ is an inner basis of $F$.

3) For any $X$, let $C$ be a set of largest cardinality with rank $\mathrm{or}(X)$ and containing $X$; then there exists
no $G$ such that $C \subset G$ and $\mathrm{or}(G) = \mathrm{or}(X) = \mathrm{or}(C)$. ■

It is worth noting that there are closed sets which are not flats. For example, consider the following
closure operator on $\{1, \ldots, n\}$, where $\mathrm{cl}(X) = \{1, \ldots, \max(X)\}$. Then it has rank 1 and hence two
flats (the empty set and $V$), while it has $n$ nonempty closed sets ($\mathrm{cl}(i)$ for all $i$). We shall clarify the relationship
between closed sets and flats below.
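The chain example can be checked exhaustively for a small ground set. The following sketch assumes $n = 5$ (an arbitrary choice) and uses our own helper names; it lists the closed sets and the flats directly from Definitions 4 and 5.

```python
from itertools import combinations

N = 5  # arbitrary small ground-set size, chosen only for illustration
V = frozenset(range(1, N + 1))
# All subsets of V, ordered by increasing size.
SUBSETS = [frozenset(c) for k in range(N + 1) for c in combinations(sorted(V), k)]


def cl(X):
    """Chain closure: cl(X) = {1, ..., max(X)}, and cl of the empty set is empty."""
    return frozenset(range(1, max(X) + 1)) if X else frozenset()


def outer_rank(X):
    """or(X) = min |b| such that X is contained in cl(b)."""
    return min(len(b) for b in SUBSETS if X <= cl(b))


OR = {X: outer_rank(X) for X in SUBSETS}

# Closed sets are the fixed points of cl.
closed_sets = [X for X in SUBSETS if cl(X) == X]

# A flat has no proper superset with the same outer rank (Definition 5).
flats = [F for F in SUBSETS if not any(F < X and OR[X] == OR[F] for X in SUBSETS)]
```

Here `flats` contains only the empty set and $V$, while `closed_sets` contains the $n$ initial segments $\{1, \ldots, i\}$ together with the empty set, so most closed sets of this operator are not flats.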

Definition 6: For any $X \subseteq V$, the union of all flats containing $X$ with outer rank equal to that of $X$
is referred to as the span of $X$, i.e.

\[ \mathrm{span}(X) := \bigcup \{F : F \text{ flat},\ X \subseteq F,\ \mathrm{or}(F) = \mathrm{or}(X)\}. \]

Proposition 4: For any $X$,

1) $\mathrm{cl}(X) \subseteq \mathrm{span}(X)$ with equality if and only if $\mathrm{cl}(X)$ is a flat;

2) $\mathrm{span}(\mathrm{cl}(X)) = \mathrm{span}(X)$;

3) $\mathrm{span}(X) = \{v \in V : \mathrm{or}(X \cup v) = \mathrm{or}(X)\}$.

Proof: The first two properties follow directly from the definition. Suppose $v \in F$, a flat containing
$X$ with $\mathrm{or}(F) = \mathrm{or}(X)$, then $\mathrm{or}(X \cup v) \le \mathrm{or}(F) = \mathrm{or}(X)$. Conversely, if $\mathrm{or}(X \cup v) = \mathrm{or}(X)$, then
$X \cup v$ is contained in a flat with the same outer rank as $X$, and hence in $\mathrm{span}(X)$. ■

The flats and spans are fundamental concepts for closure operators, as illustrated by the surprising 
result below. 

Theorem 3: The following are equivalent: 

1) cl is a matroid; 

2) all closed sets are flats; 

3) all closed sets are spans. 

Proof: The first property clearly implies the third one. Let us now prove that the second property
implies the first one. Let $X \subseteq V$, $v \in V$ and $u \in \mathrm{cl}(X \cup v) \setminus \mathrm{cl}(X)$, then $\mathrm{or}(X \cup u) = \mathrm{or}(X) + 1 =
\mathrm{or}(X \cup v)$, and hence $\mathrm{cl}(X \cup u) = \mathrm{cl}(X \cup v)$. Thus, cl satisfies the Saunders-Mac Lane exchange axiom.

We now prove that the third property implies the second. Suppose all closed sets are spans, and consider
a minimal closed set $c$ of outer rank 1, i.e. $\mathrm{or}(c) = 1$ and $\mathrm{or}(c') = 0$ for any closed set $c' \subset c$. Since
$\mathrm{span}(\emptyset) = \mathrm{cl}(\emptyset)$, we must have $c = \mathrm{span}(c)$ and hence $c$ is a flat. There can be no closed set of outer
rank 1 properly containing $c$, therefore all closed sets of outer rank 1 are flats. Now consider any minimal
closed set $c_2$ of outer rank 2. It is equal to the span of some closed set $X$; then $\mathrm{or}(X) = 2$, by what
we have previously shown, and hence $X = c_2$ and $c_2$ is again a flat. By induction on the outer rank, we
prove that all closed sets are indeed flats. ■

We would like to explain the significance of flats in matroids for random network coding. A model for 
noncoherent random network coding based on matroids is proposed in [27], which generalises routing (a
special case for the uniform matroid), linear network coding (the projective geometry) and affine network 
coding (the affine geometry). In order to combine the messages they receive, the intermediate nodes select 
a random element from the closure of the received messages. The model is based on matroids because all 
closed sets are flats, hence a new message is either in the closure of all the previously received messages 
(and is not informative), or it increases the outer rank (and is fully informative). 

There are solvable closures which are not matroids, e.g. the undirected graph $C_4$ displayed in Figure
3. It is solvable because it has rank 2 and contains $K_2 \cup K_2$. In that case, note that the outer rank is
submodular, and hence $\mathrm{span}_{C_4} = U_{2,4}$ is a matroid; however, $\mathrm{cl}_{C_4}$ is not a matroid itself.

Fig. 3. The graph C4 whose closure is solvable but not a matroid. 
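The claims about $C_4$ can be verified numerically. The sketch below is a brute-force check under our own naming; it treats $C_4$ as a symmetric digraph, computes the span through the characterisation $\mathrm{span}(X) = \{v : \mathrm{or}(X \cup v) = \mathrm{or}(X)\}$, and tests the exchange axiom (if $u \in \mathrm{cl}(X \cup v) \setminus \mathrm{cl}(X)$ then $v \in \mathrm{cl}(X \cup u)$) for both operators.

```python
from itertools import combinations

N = 4
# Undirected 4-cycle viewed as a symmetric digraph: in-neighbours of v are v-1 and v+1 mod 4.
IN_NBRS = {v: {(v - 1) % N, (v + 1) % N} for v in range(N)}
V = frozenset(range(N))
SUBSETS = [frozenset(c) for k in range(N + 1) for c in combinations(range(N), k)]


def cl(X):
    """Digraph closure: add vertices whose whole in-neighbourhood is already inside."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, preds in IN_NBRS.items():
            if v not in closed and preds <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)


def outer_rank(X):
    """or(X) = min |b| such that X is contained in cl(b)."""
    return min(len(b) for b in SUBSETS if X <= cl(b))


def span(X):
    """span(X) = set of vertices v that do not increase the outer rank of X."""
    return frozenset(v for v in V if outer_rank(X | {v}) == outer_rank(X))


def satisfies_exchange(closure):
    """Check the exchange axiom for every X, v and every u newly generated by adding v."""
    for X in SUBSETS:
        for v in V:
            for u in closure(X | {v}) - closure(X):
                if v not in closure(X | {u}):
                    return False
    return True
```

With these definitions the exchange axiom fails for `cl` (take $X = \{0\}$, $v = 2$, $u = 1$) but holds for `span`, which fixes sets of size at most one and sends every larger set to $V$, i.e. it behaves as the uniform matroid $U_{2,4}$.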

C. Upper and lower ranks 

We are now interested in lower bounds on the entropy of coding functions. Since any closure operator 
has a trivial coding function with entropy zero (where the universal partition is placed on every vertex), 
the entropy of any coding function cannot be bounded below. Therefore, most of our bounds will apply 
to solutions only. 

Definition 7: The lower rank and upper rank of $X$ are respectively defined as

\[ \mathrm{lr}(X) := \min\{|Y| : \mathrm{cl}(Y \cup (V \setminus X)) = V\}, \]

\[ \mathrm{ur}(X) := r - \mathrm{lr}(V \setminus X). \]

A few elementary properties of the lower and upper ranks are listed below. 
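Lemma 5: The following hold:

1) $\mathrm{lr}(V) = \mathrm{ur}(V) = r$ and $\mathrm{lr}(\emptyset) = \mathrm{ur}(\emptyset) = 0$.

2) For any $X \subseteq V$, $\mathrm{lr}(X) = 0$ if and only if $\mathrm{cl}(V \setminus X) = V$. Hence $\mathrm{ur}(X) = r$ if and only if
$\mathrm{cl}(X) = V$.

3) For any $X \subseteq V$,

\[ \mathrm{ur}(X) = r - \min\{\mathrm{or}(Y) : \mathrm{cl}(X \cup Y) = V\} = r - \min\{\mathrm{or}(F) : F \text{ flat and } \mathrm{cl}(X \cup F) = V\}; \]

4) $\mathrm{ur}(X) = \mathrm{ur}(\mathrm{cl}(X))$ and $\mathrm{lr}(X) = \mathrm{lr}(\mathrm{cl}(X))$;

5) If $X \subseteq Z$, then $\mathrm{ur}(X) \le \mathrm{ur}(Z)$ and $\mathrm{lr}(X) \le \mathrm{lr}(Z)$.

6) $\mathrm{lr}(X) \le \mathrm{ur}(X) \le \mathrm{or}(X)$.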

Proof: All properties, except the last one, are easily proved. The inequality $\mathrm{ur}(X) \le \mathrm{or}(X)$ follows
from the subadditivity of the outer rank. To prove that $\mathrm{lr}(X) \le \mathrm{ur}(X)$, let $b$ be a basis for cl. Then

\[ V = \mathrm{cl}(b) = \mathrm{cl}\{(b \cap X) \cup (b \cap (V \setminus X))\} \subseteq \mathrm{cl}\{(b \cap X) \cup (V \setminus X)\}, \]

and hence $\mathrm{cl}\{(b \cap X) \cup (V \setminus X)\} = V$, thus $|b \cap X| \ge \mathrm{lr}(X)$. Similarly, $|b \cap (V \setminus X)| \ge \mathrm{lr}(V \setminus X)$, and
hence $r = |b| \ge \mathrm{lr}(X) + \mathrm{lr}(V \setminus X)$. ■
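We remark that for any solution $f$, we have $r = H_f(V) \le H_f(X) + H_f(Y) \le H_f(X) + \mathrm{or}(Y)$ for any $X, Y$ such that
$\mathrm{cl}(X \cup Y) = V$. Minimising over $Y$ and using the third property of Lemma 5, we obtain

\[ H_f(X) \ge \mathrm{ur}(X) \]

for all $X \subseteq V$.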

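Corollary 2: For any solution $f$ of cl and any $X \subseteq V$,

\[ r - H_f(V \setminus X) \le r - \mathrm{ur}(V \setminus X) = \mathrm{lr}(X) \le r - \mathrm{lr}(V \setminus X) = \mathrm{ur}(X) \le H_f(X) \le \mathrm{or}(X). \]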

Note that a trivial lower bound on $H_f(X)$ (where $f$ is a solution) is given by $r - H_f(V \setminus X)$. Therefore,
the intermediate bounds on $H_f(X)$ in Corollary 2 refine this trivial bound.

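Some of the results above can be generalised for any coding function $f$: denoting

\[ \mathrm{lr}_f(X) = \min\{H_f(Y) : \mathrm{cl}(Y \cup (V \setminus X)) = V\}, \]
\[ \mathrm{ur}_f(X) = H_f(V) - \mathrm{lr}_f(V \setminus X), \]

we obtain

\[ H_f(V) - H_f(V \setminus X) \le H_f(V) - \mathrm{ur}_f(V \setminus X) = \mathrm{lr}_f(X) \le H_f(V) - \mathrm{lr}_f(V \setminus X) = \mathrm{ur}_f(X) \le H_f(X) \le \mathrm{or}(X). \]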

D. Inner and outer complemented sets 

Definition 8: We say a set $X$ is outer complemented if $\mathrm{or}(X) = \mathrm{ur}(X)$. Moreover, we say it is inner
complemented if $\mathrm{ir}(X) = \mathrm{ur}(X)$.

Therefore, if $X$ is outer complemented, then $H_f(X) = \mathrm{or}(X) = \mathrm{ur}(X)$ for any solution $f$.
Remark that $X$ is outer (inner) complemented if and only if $\mathrm{cl}(X)$ is outer (inner) complemented.
Proposition 5: The following are equivalent: 

1) X is outer complemented; 

2) there exists $Z$ such that $\mathrm{or}(X) + \mathrm{or}(Z) = r$, $\mathrm{cl}(X \cup Z) = V$ and $X \cap Z = \emptyset$;

3) any outer basis of X is contained in a basis of V . 

Similar results hold for inner complemented sets. The following are equivalent: 

1) X is inner complemented; 

2) $X$ is outer complemented and $\mathrm{ir}(X) = \mathrm{or}(X)$;

3) any inner basis of X is contained in a basis of V. 



November 15, 2012 



DRAFT 







Fig. 4. The graph C5 whose closure is outer complemented and not solvable. 



Proof: The equivalence of the first two properties is easily shown. If $X$ is outer complemented, let
$o$ be an outer basis of $X$ and let $Z$ satisfy $\mathrm{cl}(X \cup Z) = V$ and $|Z| = r - \mathrm{or}(X)$. Then $o \cup Z$ is a basis
of $V$. Conversely, if any outer basis can be extended to a basis, then any such extension is a valid $Z$ for
Property 2.

The properties for an inner complemented set are easy to prove. ■ 

We saw earlier that $\mathrm{cl}(X) \subseteq \mathrm{cl}_f(X)$ for any coding function $f$ and any $X$. This can be refined when
$f$ is a solution and $X$ is outer complemented.

Proposition 6: If $f$ is a solution of cl then $\mathrm{cl}(\mathrm{span}(X)) \subseteq \mathrm{cl}_f(X)$ for any outer complemented $X$.

Proof: For any outer complemented $X$, we have $H_f(X) = \mathrm{or}(X)$. Suppose $v \in \mathrm{span}(X)$, then
$\mathrm{or}(X) = \mathrm{or}(X \cup v) \ge H_f(X \cup v) \ge H_f(X) = \mathrm{or}(X)$ and hence $v \in \mathrm{cl}_f(X)$. Since $\mathrm{cl}_f(X)$ is a closed
set of cl, we easily obtain that $\mathrm{cl}(\mathrm{span}(X)) \subseteq \mathrm{cl}_f(X)$. ■

Corollary 3: If there exists an outer complemented set X such that its span has higher outer rank and 
is also outer complemented, then cl is not solvable over any alphabet. 

By extension, we say that cl is outer complemented if all sets are outer complemented. 

Theorem 4: Suppose that cl has rank r and is outer complemented. Then cl is solvable if and only if 
span is a solvable matroid with rank r. 

Proof: If all sets are outer complemented, then any solution $f$ of cl is also a coding function of
span since $\mathrm{span}(X) = \{v \in V : H_f(X \cup v) = H_f(X)\}$. Since the outer rank is equal to the entropy
$H_f$, it is submodular and hence span is a matroid whose rank function is given by the outer rank. Thus
span has rank $r$ and $f$ is a solution for it.

Conversely, if span is a solvable matroid with rank $r$, then we have $\mathrm{cl} \le \mathrm{span}$ and by Lemma 1, cl
is solvable. ■
For instance, for the undirected cycle $C_5$ in Figure 4, $\mathrm{cl}_{C_5}$ is outer complemented, though the outer
rank is not submodular, hence span is not a matroid. As such, $C_5$ is not solvable (its entropy is actually
2.5).
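Both properties of $C_5$ used here (every set is outer complemented, yet the outer rank is not submodular) can be confirmed by exhaustive search. The sketch below follows Definitions 4, 7 and 8 literally; the variable names are ours, and $C_5$ is modelled as a symmetric digraph.

```python
from itertools import combinations

N = 5
# Undirected 5-cycle as a symmetric digraph: in-neighbours of v are v-1 and v+1 mod 5.
IN_NBRS = {v: {(v - 1) % N, (v + 1) % N} for v in range(N)}
V = frozenset(range(N))
SUBSETS = [frozenset(c) for k in range(N + 1) for c in combinations(range(N), k)]


def cl(X):
    """Digraph closure: add vertices whose whole in-neighbourhood is already inside."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, preds in IN_NBRS.items():
            if v not in closed and preds <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)


CL = {X: cl(X) for X in SUBSETS}
# Outer rank: smallest b whose closure contains X.
OR = {X: min(len(b) for b in SUBSETS if X <= CL[b]) for X in SUBSETS}
R = OR[V]  # rank of the closure
# Lower rank: smallest Y that, together with the complement of X, generates V.
LR = {X: min(len(Y) for Y in SUBSETS if CL[Y | (V - X)] == V) for X in SUBSETS}
# Upper rank: ur(X) = r - lr(V \ X).
UR = {X: R - LR[V - X] for X in SUBSETS}

outer_complemented = all(OR[X] == UR[X] for X in SUBSETS)
submodular = all(
    OR[X | Y] + OR[X & Y] <= OR[X] + OR[Y] for X in SUBSETS for Y in SUBSETS
)
```

One violating pair for submodularity is $X = \{0, 1, 2\}$ and $Y = \{1, 2, 3\}$: both have outer rank 2, while their union has outer rank 3 and their intersection has outer rank 2.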

We would like to emphasize that if all sets are outer complemented, then the outer rank must be 
submodular, i.e. the rank function of a matroid. However, this does not imply that cl should be a matroid 
itself. For instance, consider cl defined on $\{1,2,3\}$ as follows: $\mathrm{cl}(1) = 12$, $\mathrm{cl}(2) = 2$, $\mathrm{cl}(3) = 3$,
$\mathrm{cl}(13) = \mathrm{cl}(23) = 123$. Then any set is inner complemented, cl is solvable (by letting $f_1 = f_2$ and
such that $f_1 \vee f_3 = E_{A^2}$) but cl is not a matroid.

E. Subcomplemented sets 

We now refine some of these results by using submodularity. The results in this subsection are not 
as elegant as those given earlier. Therefore, we shall not try to refine them any further by using more 
(linear or nonlinear) information inequalities. 

Definition 9: The alpha rank of $X$ is defined as follows. Let $\alpha_0(X) = \mathrm{or}(X)$ and recursively

\[ \alpha_i(X) = \min\{\alpha_{i-1}(S) + \alpha_{i-1}(T) - r : \mathrm{cl}(S \cup T) = V,\ \mathrm{cl}(S \cap T) = \mathrm{cl}(X)\}. \]

Finally, let $\alpha(X) = \alpha_{r2^n}(X)$.

The beta rank of $X$ is defined as follows. Let $\beta_0(X) = 0$ and for any $1 \le i \le r2^n$, let

\[ \beta_i(X) = \max\{r - \alpha(Y) + \beta_{i-1}(X \cap Y) : \mathrm{cl}(X \cup Y) = V\}. \]

Finally, denote $\beta(X) := \beta_{r2^n}(X)$.

The $\alpha_i$'s form a sequence of upper bounds on the entropy $H_f(X)$, when $f$ is a solution. Conversely, the
$\beta_i$'s form a sequence of lower bounds on $H_f(X)$ for any coding function $f$. Remark that $\beta_1(X) \ge \mathrm{ur}(X)$.

Proposition 7: The alpha and beta rank satisfy the following properties: for any $X, Z \subseteq V$,

1) $\alpha(X) = \alpha(\mathrm{cl}(X))$ and $\beta(X) = \beta(\mathrm{cl}(X))$;

2) $X \subseteq Z$ implies $\alpha(X) \le \alpha(Z)$ and $\beta(X) \le \beta(Z)$;

3) $\mathrm{ur}(X) \le \beta(X)$ and $\alpha(X) \le \mathrm{or}(X)$;

4) for any solution $f$, we have $\beta(X) \le H_f(X) \le \alpha(X)$.

Proof: We give the proofs for the alpha rank, since the proofs for the beta rank are similar. The
first property is trivial, while the second and third are easily proved by induction on $i$; we have $\alpha_i(X) \le
\alpha_{i-1}(X)$ and $\alpha_i(X) \le \alpha_i(Z)$ for all $1 \le i \le r2^n$. By induction on $i$, we can also prove that $H_f(X) \le
\alpha_i(X)$ as follows. For $i = 1$, we have $H_f(X) \le H_f(S) + H_f(T) - r$ for any $S, T \subseteq V$ such that
$\mathrm{cl}(S \cup T) = V$ and $\mathrm{cl}(S \cap T) = \mathrm{cl}(X)$. Therefore, $H_f(X) \le \alpha_1(X)$. The induction step is entirely
similar. ■
The alpha and beta rank of a solvable closure must satisfy 

\[ \mathrm{ur}(X \cup Z) + \mathrm{ur}(X \cap Z) \le \beta(X \cup Z) + \beta(X \cap Z) \le \alpha(X) + \alpha(Z) \le \mathrm{or}(X) + \mathrm{or}(Z) \]

for all $X, Z \subseteq V$.

Definition 10: A set $X$ is subcomplemented if $\alpha(X) = \beta(X)$.

Therefore, if $X$ is subcomplemented, then $H_f(X) = \alpha(X)$ for any solution $f$. We can mimic the
proof of Theorem 4 to refine it thus. We define the alpha-span of $X$ as

\[ \mathrm{aspan}(X) = \{v \in V : \alpha(X \cup v) = \alpha(X)\}. \]

Theorem 5: Suppose that all sets are subcomplemented with respect to cl, a closure operator with rank
$r$. Then cl is solvable if and only if aspan is a solvable matroid with rank $r$.

VI. Combining closure operators 

In this section, we let $S, T \subseteq V$ such that $S \cap T = \emptyset$ and $S \cup T = V$. For any $X \subseteq V$, we denote

$X_S = X \cap S$ and $X_T = X \cap T$.

A. Disjoint and unidirectional unions 

We first generalise some definitions from matroid theory. 

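Definition 11: For any closure cl and any $T \subseteq V$, the deletion of $T$ and the contraction of $T$ from cl
are the closures defined on $S$ by

\[ \mathrm{cl}|_{\setminus T}(X) := \mathrm{cl}(X) \setminus T, \]
\[ \mathrm{cl}|_{/T}(X) := \mathrm{cl}(X \cup T) \setminus T \]

for any $X \subseteq S$.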

Note that $\mathrm{cl}|_{\setminus T} \le \mathrm{cl}|_{/T}$. We have $r(\mathrm{cl}|_{/T}) = \mathrm{lr}(S)$ and $r(\mathrm{cl}|_{\setminus T}) \ge \mathrm{ir}(S)$ with equality if $S$ is closed.
An example when the inequality is strict is given in the three-vertex path $P_3$, and $S = 13$, as displayed
in Figure 5.
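The $P_3$ example can be worked out mechanically. The sketch below assumes the path $1 - 2 - 3$ viewed as a symmetric digraph, with $T = \{2\}$ and $S = \{1, 3\}$; the function names are ours. It evaluates the deletion and contraction of Definition 11 together with the ranks they are compared to.

```python
from itertools import combinations

# Path 1 - 2 - 3 as a symmetric digraph (in-neighbourhoods).
IN_NBRS = {1: {2}, 2: {1, 3}, 3: {2}}
V = frozenset(IN_NBRS)
T = frozenset({2})
S = V - T


def cl(X):
    """Digraph closure: add vertices whose whole in-neighbourhood is already inside."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, preds in IN_NBRS.items():
            if v not in closed and preds <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)


def subsets(ground):
    """All subsets of `ground`, smallest first."""
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]


def deletion(X):
    """Deletion of T: cl(X) \\ T, for X inside S."""
    return cl(X) - T


def contraction(X):
    """Contraction of T: cl(X u T) \\ T, for X inside S."""
    return cl(X | T) - T


def rank(closure, ground):
    """Rank of a closure on `ground`: size of a smallest generating set."""
    return min(len(b) for b in subsets(ground) if closure(b) == ground)


def inner_rank(X):
    """ir(X) = min |b| such that cl(b) = cl(X), with b ranging over subsets of V."""
    return min(len(b) for b in subsets(V) if cl(b) == cl(X))


def lower_rank(X):
    """lr(X) = min |Y| such that cl(Y u (V \\ X)) = V."""
    return min(len(Y) for Y in subsets(V) if cl(Y | (V - X)) == V)
```

Here the deletion has rank 2 while $\mathrm{ir}(S) = 1$ (the middle vertex alone generates $V$), so the inequality is strict; the contraction has rank 0, which agrees with $\mathrm{lr}(S)$.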

We now determine the lower and upper ranks for closure operators associated to digraphs. 
Proposition 8: If $\mathrm{cl}_D$ is the closure associated to the digraph $D$, then for any $S \subseteq D$, $\mathrm{cl}_{D[S]} = \mathrm{cl}_D|_{/T}$,
where $D[S]$ is the digraph induced by the vertices in $S$. Thus $\mathrm{lr}(S) = |S| - \mathrm{mias}(S)$ for any $S$.

Fig. 5. Example where $r(\mathrm{cl}|_{\setminus T}) > \mathrm{ir}(S)$, where $S = 13$.

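Proof: Let $X \subseteq S$, then any subset $Y$ of $S \setminus X = V \setminus (X \cup T)$ induces an acyclic subgraph of
$D$ if and only if it induces an acyclic subgraph of $D[S]$; moreover, $Y^- \subseteq X \cup Y$ in $D[S]$ if and
only if $Y^- \subseteq X \cup Y \cup T$ in $D$. By Lemma 2, we obtain $\mathrm{cl}_{D[S]}(X) \setminus X = \mathrm{cl}_D(X \cup T) \setminus (X \cup T)$ and hence

\[ \mathrm{cl}_D|_{/T}(X) = \mathrm{cl}_{D[S]}(X). \] ■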

In particular, we obtain $H_f(X) \ge |X| - \mathrm{mias}(X)$ for all $X$ and any solution $f$. Therefore, any solution
must be "at least as good" as letting any subset of players play independently. 
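The identity $\mathrm{lr}(S) = |S| - \mathrm{mias}(S)$ can be tested on a small digraph. In the sketch below the digraph (a directed triangle $0 \to 1 \to 2 \to 0$ plus a 2-cycle between 2 and 3) is a hypothetical example of ours, and `mias` is computed by exhaustive search over induced acyclic subsets.

```python
from itertools import combinations

# Hypothetical digraph: arcs 0->1, 1->2, 2->0, 2->3, 3->2, stored as in-neighbourhoods.
IN_NBRS = {0: {2}, 1: {0}, 2: {1, 3}, 3: {2}}
V = frozenset(IN_NBRS)
SUBSETS = [frozenset(c) for k in range(len(V) + 1) for c in combinations(sorted(V), k)]


def cl(X):
    """Digraph closure: add vertices whose whole in-neighbourhood is already inside."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, preds in IN_NBRS.items():
            if v not in closed and preds <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)


def is_acyclic(Y):
    """Peel off vertices with no in-neighbour inside the remaining set; fail if stuck."""
    remaining = set(Y)
    while remaining:
        sources = {v for v in remaining if not (IN_NBRS[v] & remaining)}
        if not sources:
            return False
        remaining -= sources
    return True


def mias(S):
    """Maximum size of a subset of S inducing an acyclic subgraph."""
    return max(len(Y) for Y in SUBSETS if Y <= S and is_acyclic(Y))


def lower_rank(S):
    """lr(S) = min |Y| such that cl(Y u (V \\ S)) = V."""
    return min(len(Y) for Y in SUBSETS if cl(Y | (V - S)) == V)
```

For instance, on this digraph `lower_rank(V)` equals 1, since vertex 2 alone generates $V$ and $\{0, 1, 3\}$ is a largest induced acyclic set.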

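Definition 12: For two closures $\mathrm{cl}_1$ and $\mathrm{cl}_2$ defined on $S$ and $T$ respectively, we define the disjoint
and unidirectional unions of these closures on $V = S \cup T$ respectively as

\[ \mathrm{cl}_1 \cup \mathrm{cl}_2 (X) := \mathrm{cl}_1(X_S) \cup \mathrm{cl}_2(X_T), \]

\[ \mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2 (X) := \begin{cases} S \cup \mathrm{cl}_2(X_T) & \text{if } \mathrm{cl}_1(X_S) = S \\ \mathrm{cl}_1(X_S) \cup X_T & \text{otherwise.} \end{cases} \]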

For instance, any closure can be decomposed into cl = cl j / cl ( ) u C^o,|ci(0)|- Moreover, if there is a 
loop on vertex v in the digraph D, then cl£>(X) = c\d(X\v) U (X D v), or in other words, do = 
dD[v\v] U Ui t i = c\ D [v\ v ]UUi t i. 

We remark that if cli and CI2 are matroids, then cli U d.2 = cliUcl2, which is commonly referred to 
as the direct sum of cli and CI2 II2TI . 

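Recall the definitions of unions of digraphs in Section III. Our definitions were then tailored such that

\[ \mathrm{cl}_{D_1 \cup D_2} = \mathrm{cl}_{D_1} \cup \mathrm{cl}_{D_2}, \]

\[ \mathrm{cl}_{D_1 \vec{\cup} D_2} = \mathrm{cl}_{D_1} \vec{\cup} \mathrm{cl}_{D_2}. \]

For any $\mathrm{cl}_1, \mathrm{cl}_2$ we have $\mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2 \le \mathrm{cl}_1 \cup \mathrm{cl}_2$ and

\[ r(\mathrm{cl}_1 \cup \mathrm{cl}_2) = r(\mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2) = r_1 + r_2. \]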

The disjoint and unidirectional unions are related to the contraction as follows. 
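Proposition 9: For any cl and any $T \subseteq V$, the following are equivalent:

1) $\mathrm{cl}|_{/T} = \mathrm{cl}|_{\setminus T}$;

2) $\mathrm{cl}|_{/T} \vec{\cup} \mathrm{cl}|_{/S} \le \mathrm{cl} \le \mathrm{cl}|_{/T} \cup \mathrm{cl}|_{/S}$;

3) There exist $\mathrm{cl}_1, \mathrm{cl}_2$ defined on $S$ and $T$ respectively such that

\[ \mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2 \le \mathrm{cl} \le \mathrm{cl}_1 \cup \mathrm{cl}_2. \]

Proof: The first property implies the second, due to the following pair of inequalities: For any $S$,

\[ \mathrm{cl}|_{\setminus T} \vec{\cup} \mathrm{cl}|_{/S} \le \mathrm{cl} \le \mathrm{cl}|_{/T} \cup \mathrm{cl}|_{/S}. \]

To prove the first inequality, we have

\[ \mathrm{cl}|_{\setminus T} \vec{\cup} \mathrm{cl}|_{/S}(X) = \begin{cases} S \cup \mathrm{cl}|_{/S}(X_T) = \mathrm{cl}(X_T \cup S) & \text{if } S \subseteq \mathrm{cl}(X_S) \\ (\mathrm{cl}(X_S) \cap S) \cup X_T \subseteq \mathrm{cl}(X) & \text{otherwise.} \end{cases} \]

For the second inequality, we have

\[ \mathrm{cl}(X) \setminus T = \mathrm{cl}(X_S \cup X_T) \setminus T \subseteq \mathrm{cl}(X_S \cup T) \setminus T = \mathrm{cl}|_{/T}(X_S), \]

and similarly $\mathrm{cl}(X) \setminus S \subseteq \mathrm{cl}|_{/S}(X_T)$, and hence $\mathrm{cl}(X) \subseteq \mathrm{cl}|_{/T} \cup \mathrm{cl}|_{/S}(X)$.

Clearly, the second property implies the third one. Finally, if there exist such $\mathrm{cl}_1$ and $\mathrm{cl}_2$, then it is
easy to check that $\mathrm{cl}_1 = \mathrm{cl}|_{\setminus T} = \mathrm{cl}|_{/T}$. ■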

Definition 13: We say that cl is disconnected if there exists $T$ such that $\mathrm{cl}|_{/T} = \mathrm{cl}|_{\setminus T}$; it is connected
otherwise.

The closure of a non strongly connected graph is disconnected. However, there are strongly connected
graphs whose closure is disconnected, for instance if there is a loop on a vertex, or in the graph in Figure
6 where $\mathrm{cl}|_{\setminus T} = \mathrm{cl}|_{/T}$ for $T = 45$. We can determine conditions for the closure of a digraph to be
connected. However, characterising which graphs have connected closures, whether via an elementary
property or by a polynomial algorithm, remains a completely open problem. We say a cycle $v_1, \ldots, v_k$ is
minimal if there does not exist $i, j \in \{1, \ldots, k\}$, $(i, j) \ne (1, k)$, such that $v_i, \ldots, v_j$ is a cycle. In other
words, a minimal cycle does not cover another shorter cycle.

Proposition 10: Suppose $D$ is strongly connected, but $\mathrm{cl}_D|_{\setminus T} = \mathrm{cl}_D|_{/T}$. Then $T$ is acyclic and all
minimal cycles lie in $S$.

Proof: We first claim that all arcs from $T$ to $S$ come from $\mathrm{cl}_{D[T]}(\emptyset)$. Indeed, let $u \in S$ such that
$u^- \cap T \ne \emptyset$. Then $u \in \mathrm{cl}_D|_{/T}(S \setminus u) = \mathrm{cl}_D|_{\setminus T}(S \setminus u)$, and hence $u \in \mathrm{cl}_D(S \setminus u)$. Since $u^- \subseteq \mathrm{cl}_D(S \setminus u)$,
the intersection $X := \mathrm{cl}_D(S \setminus u) \cap T$ is not empty. By Lemma 2, $X$ induces an acyclic subgraph and
$X^- \subseteq S \cup X$, which is equivalent to $X \subseteq \mathrm{cl}_{D[T]}(\emptyset)$.

Fig. 6. A graph which is strongly connected but whose closure is disconnected.

Now, suppose $T$ is not acyclic, i.e. $T \ne \mathrm{cl}_{D[T]}(\emptyset)$. But then, by the claim above there are no arcs from
$T \setminus \mathrm{cl}_{D[T]}(\emptyset)$ to its complement, and $D$ is not strongly connected.

We now prove that any minimal cycle lies in $S$. First of all, if $X$ induces a minimal cycle, then it
cannot entirely lie in $T$. Suppose $X$ does not lie entirely in $S$ either. Since $X_S$ is acyclic, we have
$X_S \subseteq \mathrm{cl}|_{/T}(Y) \subseteq \mathrm{cl}(Y)$, where $Y = S \setminus X_S$. Therefore, $X_T \subseteq X^- \subseteq \mathrm{cl}(Y)$; gathering, we obtain
$X \subseteq \mathrm{cl}(Y)$. More precisely, $X \subseteq \mathrm{cl}(Y) \setminus Y$ and hence $X$ is acyclic, which is a contradiction. ■

As a corollary, if $D$ is an undirected graph, then $\mathrm{cl}_D$ is connected if and only if $D$ is connected.



B. Bidirectional union 

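Definition 14: The bidirectional union of $\mathrm{cl}_1$ and $\mathrm{cl}_2$ is defined as

\[ \mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2 (X) := \begin{cases} S \cup \mathrm{cl}_2(X_T) & \text{if } X_S = S \\ \mathrm{cl}_1(X_S) \cup T & \text{if } X_T = T \\ X_S \cup X_T & \text{otherwise.} \end{cases} \]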

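It is easily shown that $\mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2 \le \mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2$ and $r(\mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2) = \min\{r_1 + n_2, r_2 + n_1\}$ for any closure
operators $\mathrm{cl}_1$ and $\mathrm{cl}_2$ defined on disjoint sets. Moreover, for any cl and any $S \subseteq V$, we have

\[ \mathrm{cl}|_{/T} \bar{\cup} \mathrm{cl}|_{/S} \le \mathrm{cl} \le \mathrm{cl}|_{/T} \cup \mathrm{cl}|_{/S}. \]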
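The three unions and their ranks can be compared on a toy instance. The sketch below is illustrative only: it assumes $\mathrm{cl}_1$ is the closure of the complete graph $K_3$ on $S = \{0, 1, 2\}$ (so $r_1 = 2$, $n_1 = 3$) and $\mathrm{cl}_2$ is the rank-0 closure on $T = \{3, 4\}$ (so $r_2 = 0$, $n_2 = 2$); both choices and all function names are ours.

```python
from itertools import combinations

S = frozenset({0, 1, 2})
T = frozenset({3, 4})
V = S | T
SUBSETS = [frozenset(c) for k in range(len(V) + 1) for c in combinations(sorted(V), k)]


def cl1(X):
    """Closure of the complete graph K3 on S: any two vertices generate the third."""
    return S if len(X) >= 2 else frozenset(X)


def cl2(X):
    """Rank-0 closure on T: every vertex lies in the closure of the empty set."""
    return T


def disjoint(X):
    """Disjoint union: close each side independently."""
    return cl1(X & S) | cl2(X & T)


def unidirectional(X):
    """Unidirectional union: the T side is only closed once the S side is complete."""
    XS, XT = X & S, X & T
    return S | cl2(XT) if cl1(XS) == S else cl1(XS) | XT


def bidirectional(X):
    """Bidirectional union: one side is closed only when the other side is fully known."""
    XS, XT = X & S, X & T
    if XS == S:
        return S | cl2(XT)
    if XT == T:
        return cl1(XS) | T
    return XS | XT


def rank(closure):
    """Size of a smallest set whose closure is V."""
    return min(len(b) for b in SUBSETS if closure(b) == V)
```

On this instance the disjoint and unidirectional unions both have rank $r_1 + r_2 = 2$, whereas the bidirectional union has rank $\min\{r_1 + n_2, r_2 + n_1\} = 3$, and the pointwise inclusions between the three operators hold for every subset.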

The bidirectional union of digraphs does correspond to the bidirectional union of closures: 

\[ \mathrm{cl}_{D_1 \bar{\cup} D_2} = \mathrm{cl}_{D_1} \bar{\cup} \mathrm{cl}_{D_2}, \]

and the converse is given below. 

Lemma 6: If $\mathrm{cl} = \mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2$, defined on $S$ and $T$ respectively, then $\mathrm{cl}_1 = \mathrm{cl}|_{/T}$ and $\mathrm{cl}_2 = \mathrm{cl}|_{/S}$.
Moreover, if $D$ is a loopless graph, then $\mathrm{cl}_D$ is the bidirectional union of two closures if and only if $D$
is the bidirectional union of two graphs.

Proof: The first claim is easy to prove. For the second claim, if $\mathrm{cl}_D = \mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2$, then $\mathrm{cl}_D =
\mathrm{cl}_{D[S]} \bar{\cup} \mathrm{cl}_{D[T]}$ and hence it is the bidirectional union of two digraph closures. Suppose the arc $(u,v)$
is missing between $S$ and $T$. Then $\mathrm{cl}_D(V \setminus \{u,v\}) = V$ (since $v^- \subseteq V \setminus \{u,v\}$ and $u^- \subseteq V \setminus u$), while
$\mathrm{cl}|_{/T} \bar{\cup} \mathrm{cl}|_{/S}(V \setminus \{u,v\}) = V \setminus \{u,v\}$. ■

VII. Solvability graph 

A. Definition and main results 

The solvability graph extends the definition of the so-called guessing graph to all closures. Most of 
this section naturally extends [13]. Therefore, we shall omit certain proofs which are very similar to their
counterparts in [13].

For any coding function $f$, we denote $f(x)$ as the image of $x$ under some function $A^r \to A^n$ whose kernel is
$f$. To be rigorous, we should make our choice of function explicit; however this amount of rigour shall
not be necessary.

Definition 15: The solvability graph $G(\mathrm{cl}, A)$ has vertex set $A^n$ and two words $x, y \in A^n$ are adjacent
if and only if there exists no coding function $f$ such that $x, y \in f(A^r)$.

Proposition 11 below enumerates some properties of the solvability graph. In particular, Property 2
provides a concrete and elementary description of the edge set which makes adjacency between two 
configurations easily decidable. 

Proposition 11: The solvability graph G(cl, A) satisfies the following properties: 

1) It has $|A|^n$ vertices.

2) Its edge set is $E = \bigcup_{S \subseteq V,\, v \in \mathrm{cl}(S)} E_{v,S}$, where $E_{v,S} = \{xy : x_S = y_S,\ x_v \ne y_v\}$.

3) It is a Cayley graph. 

Property 2 confirms that the definition of the solvability graph does not depend on the choice of
function to define $f(A^r)$.
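The edge description in Property 2 makes small solvability graphs easy to build explicitly. The sketch below is an illustration under our own assumptions: cl is the closure of the complete graph $K_3$ (rank 2), the alphabet is binary, and the independence number is found by brute force over all vertex subsets.

```python
from itertools import combinations, product

N = 3
# Complete graph K3 as a symmetric digraph: every other vertex is an in-neighbour.
IN_NBRS = {v: {u for u in range(N) if u != v} for v in range(N)}
ALPHABET = (0, 1)
WORDS = list(product(ALPHABET, repeat=N))  # vertex set A^n of the solvability graph


def cl(X):
    """Digraph closure: add vertices whose whole in-neighbourhood is already inside."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v, preds in IN_NBRS.items():
            if v not in closed and preds <= closed:
                closed.add(v)
                changed = True
    return frozenset(closed)


def adjacent(x, y):
    """x ~ y iff they agree on some S but differ on some v in cl(S)."""
    for k in range(N + 1):
        for S in combinations(range(N), k):
            if all(x[s] == y[s] for s in S) and any(x[v] != y[v] for v in cl(S)):
                return True
    return False


def independence_number():
    """Largest set of pairwise non-adjacent words, by exhaustive search."""
    best = 0
    for mask in range(1 << len(WORDS)):
        chosen = [w for i, w in enumerate(WORDS) if (mask >> i) & 1]
        if len(chosen) > best and all(
            not adjacent(x, y) for x, y in combinations(chosen, 2)
        ):
            best = len(chosen)
    return best
```

For this closure two words are adjacent exactly when they differ in a single coordinate, and the largest independent sets have $4 = |A|^2$ words (for example the even-weight words), which matches the rank of the closure.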

The degree of any vertex in the solvability graph, which can be determined for the case of closure 
operators defined on digraphs, cannot be easily given in general. 

The main reason to study the solvability graph is given in Theorem 6 below.

Theorem 6: A set of words in A n is an independent set of the solvability graph if and only if they 
are the image of A r by a coding function for the closure operator. 

Proof: Let $f$ be a coding function for cl and suppose $x = f(x')$ and $y = f(y')$ are adjacent in $G$.
Then $x_v \ne y_v$ and $x_S = y_S$ for some $v \in \mathrm{cl}(S)$, and hence $x'$ and $y'$ are in the same part of $f_S$ but in
different parts of $f_v$ (and hence of $f_{S \cup v}$). Thus $f_S$ is strictly refined by $f_{S \cup v}$, and hence by $f_{\mathrm{cl}(S)}$, which
contradicts the fact that $f_S = f_{\mathrm{cl}(S)}$.

Conversely, let $\{x^a\}_{a=1}^k$ be an independent set of $G$ and let $f$ be described as follows. Denote the
elements of $A^r$ as $w^1, \ldots, w^{|A|^r}$; then $f_v(w^a) = x^a_v$ if $a \le k$ and $f_v(w^a) = \lambda$ otherwise, where $\lambda$ is
chosen so that there exists no $x^a$ with $x^a_v = \lambda$. We now prove that $f$ is indeed a coding function. We
only need to show that for any $S$ and $v \in \mathrm{cl}(S) \setminus S$, we have $f_S = f_{S \cup v}$. Take $w^a, w^b \in A^r$ and denote
$f(w^a) = y^a$ and $f(w^b) = y^b$. If $y^a_S \ne y^b_S$, they are in different parts of $f_S$ and hence in different parts
of $f_{S \cup v}$. Otherwise, they are in the same part of $f_S$. Note that either they are both in the image of $f$ or
none of them is. In either case, we have $y^a_v = y^b_v$ and they are in the same part of $f_v$ as well (and hence
of $f_{S \cup v}$). ■
Corollary 4: We have $\log_{|A|} \alpha(G) = H(\mathrm{cl}, A)$ and hence $\alpha(G) = |A|^r$ if and only if $(\mathrm{cl}, A)$ is solvable.

In [13], we remark that the index coding problem asks for the chromatic number of the guessing graph
of a digraph. We can extend the index coding problem to any closure operator and we say that cl is
index-solvable over $A$ if $I(\mathrm{cl}, A) := \log_{|A|} \chi(G(\mathrm{cl}, A)) = n - r$. We have $H(\mathrm{cl}, A) + I(\mathrm{cl}, A) \ge n$, since
$\alpha(G)\chi(G) \ge |A|^n$ for any graph $G$ on $|A|^n$ vertices.

Furthermore, asymptotically, we have

\[ \lim_{|A| \to \infty} H(\mathrm{cl}, A) + \lim_{|A| \to \infty} I(\mathrm{cl}, A) = n. \]

Therefore, although determining $H(\mathrm{cl}, A)$ and $I(\mathrm{cl}, A)$ are distinct problems over a fixed alphabet $A$, they
are asymptotically equivalent. More strikingly, solvability and index-solvability are equivalent for finite
alphabets too, as seen below.

Theorem 7: The closure operator cl is solvable over A if and only if it is index-solvable over A. 
Proof: Let $\{x^i\}$ be an independent set of $G$ of size $|A|^r$ and $b$ be a basis of cl. Without loss, let $b = \{1, \ldots, r\}$.
First, we remark that $x^i_b \ne x^j_b$ for all $i \ne j$, for otherwise $x^i_{V \setminus b} \ne x^j_{V \setminus b}$ and $x^i_b = x^j_b$ means that $x^i \sim x^j$.
Secondly, let $A = \mathbb{Z}_{|A|}$, then for any $w \in A^{n-r}$ and any $i$, denote $x^i + w = (x^i_b, x^i_{V \setminus b} + w)$. Then it is
easily shown that $S_w = \{x^i + w\}$ forms an independent set and that the family $\{S_w\}$ forms a partition
of $A^n$ into $|A|^{n-r}$ independent sets.

Conversely, if $\chi(G(\mathrm{cl}, A)) = |A|^{n-r}$, then $\alpha(G(\mathrm{cl}, A)) = |A|^r$ by the inequality $H(\mathrm{cl}, A) + I(\mathrm{cl}, A) \ge n$. ■

B. Neighbourhood and girth

Note that the relation "having an arc from u to v" cannot be expressed in terms of the digraph closure. 
Indeed, all acyclic graphs on n vertices, from the empty graph to an acyclic tournament, all have the 
same closure operator $U_{0,n}$. However, the closure of the in-neighbourhood of a vertex can be described
by means of the digraph closure.

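Lemma 7: For any $v$ and any $X \subseteq V \setminus \{v\}$, $v \in \mathrm{cl}_D(X)$ if and only if $\mathrm{cl}_D(v^-) \subseteq \mathrm{cl}_D(X)$.

Proof: Suppose $v \in \mathrm{cl}_D(X) \setminus X$, then $Y = \mathrm{cl}_D(X) \setminus X$ induces an acyclic subgraph and $Y^- \subseteq
\mathrm{cl}_D(X)$; in particular, $v^- \subseteq \mathrm{cl}_D(X)$. Since $v \in \mathrm{cl}_D(v^-)$, we easily obtain the converse. ■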

We remark that if there is a loop on $v$, then there exists no set $X \subseteq V \setminus \{v\}$ such that $v \in \mathrm{cl}_D(X)$.
Note that v~ is not necessarily an inner basis of its own closure, for instance this is trivial in nonempty 
acyclic digraphs. 

Based on our results about closures associated to digraphs, we can define, for any closure, some concepts
which generalise those of digraphs.

Definition 16: For any vertex v, the degree of v is 

\[ d_v := \min\{|X| : v \in \mathrm{cl}(X) \setminus X\} \]

if there exists such set X, or by convention is equal to otherwise. We denote the minimum degree as 
$\delta$.

Note that the degree (according to the closure $\mathrm{cl}_D$) of a vertex of the digraph $D$ is not necessarily
equal to the size of its in-neighbourhood. 

Definition 17: We say a subset $X$ of vertices is acyclic if $\mathrm{lr}(X) = 0$. The girth $\gamma$ of the closure is
the minimum size of a non-acyclic subset of vertices.

Here, the girth of a digraph is equal to the girth of its closure. 

We denote the maximum cardinality of a code over A of length n and minimum distance d as M(n,d). 
Proposition 12: For any cl, we have 

\[ \log_{|A|} M(n, n - \delta + 1) \le H(\mathrm{cl}, A) \le \log_{|A|} M(n, \gamma). \]

Since $\delta \le r$ and $\gamma \le n - r + 1$, we have $\gamma = n - \delta + 1$ if and only if $\mathrm{cl} = U_{r,n}$.

C. Combining digraphs 

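Theorem 8: For any $\mathrm{cl}_1$ and $\mathrm{cl}_2$ defined on disjoint sets $S$ and $T$ of cardinalities $n_1$ and $n_2$, we have

\[ G(\mathrm{cl}_1 \cup \mathrm{cl}_2, A) = G(\mathrm{cl}_1, A) \oplus G(\mathrm{cl}_2, A), \]
\[ G(\mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2, A) = G(\mathrm{cl}_1, A) \cdot G(\mathrm{cl}_2, A), \]
\[ G(\mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2, A) = G(\mathrm{cl}_1, A) \,\square\, G(\mathrm{cl}_2, A). \]

Fig. 7. The bidirectional union $E_3 \bar{\cup} C_5$. The vertices of $C_5$ form a basis; the highlighted disjoint cliques 127, 248, 56 show
that it is solvable.

Therefore,

\[ H(\mathrm{cl}_1 \cup \mathrm{cl}_2, A) = H(\mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2, A) = H(\mathrm{cl}_1, A) + H(\mathrm{cl}_2, A), \]
\[ H(\mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2, A) \le \min\{H(\mathrm{cl}_1, A) + n_2,\ H(\mathrm{cl}_2, A) + n_1\}. \]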

Corollary 5: The following are equivalent: 

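• $\mathrm{cl}_1$ and $\mathrm{cl}_2$ are solvable

• $\mathrm{cl}_1 \cup \mathrm{cl}_2$ is solvable

• $\mathrm{cl}_1 \vec{\cup} \mathrm{cl}_2$ is solvable.

Moreover, suppose $n_1 - r_1 \ge n_2 - r_2$, then $\mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2$ is solvable if and only if $\mathrm{cl}_1$ is solvable and
$H(\mathrm{cl}_2) \ge n_2 - n_1 + r_1$. In particular, if both $\mathrm{cl}_1$ and $\mathrm{cl}_2$ are solvable, then so is $\mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2$.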

Therefore, when studying solvability, we need only consider connected closures. Moreover, the solvability
of a bidirectional union reduces to entropy problems in the two constituent parts. An example
where $\mathrm{cl}_2$ is not solvable, yet $\mathrm{cl}_1 \bar{\cup} \mathrm{cl}_2$ is solvable, is given in Figure 7.

D. Combining alphabets 

Let [k] = {1, . . . , k} for any positive integer k. We define a closure on V x [k] as follows. For any 
$v \in V$, let $[v] = \{(v, i) : i \in [k]\}$ and for any $X \subseteq V \times [k]$, denote $X_V = \{v \in V : [v] \subseteq X\}$. Then

\[ \mathrm{cl}^{[k]}(X) := X \cup \{[v] : v \in \mathrm{cl}(X_V)\}. \]
This closure can be intuitively explained as follows. Consider the solvability problem of cl over the
alphabet $A^k$. Each element of $A^k$ is a vector of length $k$ over $A$, then $\mathrm{cl}^{[k]}$ associates the $k$ corresponding
vertices $[v]$ to each $v \in V$, each new vertex $(v, i)$ corresponding to the coordinate $i$. If $v \in \mathrm{cl}(Y)$ for
some $Y \subseteq V$, then the local function $f_v$ depends on $f_Y$. We can view $f_v : A^{kr} \to A^k$ (and hence all
its coordinate functions) as depending on all coordinates of all vertices in $Y$, hence the definition of the
closure.

In particular, for a digraph $D$, construct $D^{[k]}$ as follows: its vertex set is $V \times [k]$ and its edge set is $\{((u, i), (v, j)) :
(u, v) \in E(D)\}$. Then it is easy to check that $\mathrm{cl}_D^{[k]} = \mathrm{cl}_{D^{[k]}}$.
Proposition 13: We have the following properties: 

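1) $r(\mathrm{cl}^{[k]}) = k\, r(\mathrm{cl})$.

2) $G(\mathrm{cl}^{[k]}, A) \cong G(\mathrm{cl}, A^k)$ and hence $H(\mathrm{cl}^{[k]}, A) = k H(\mathrm{cl}, A^k)$.

3) If cl is simple, then for all $k$, $\mathrm{cl}^{[k]}$ is simple too.

4) If cl is connected, then so is $\mathrm{cl}^{[k]}$ for all $k$.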

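Proof: The proof of the first two claims is similar to that of [13, Proposition 10], while the third
claim is easily proved.

We now prove the last claim. For any $S \subseteq V \times [k]$, we denote $\lfloor S \rfloor = \bigcup_{v \in S_V} [v]$, $T = (V \times [k]) \setminus S$
and $\lceil T \rceil = T \cup (S \setminus \lfloor S \rfloor) = (V \times [k]) \setminus \lfloor S \rfloor$. Note that $S_V = \lfloor S \rfloor_V = V \setminus \lceil T \rceil_V$. Then we claim that if
$\mathrm{cl}^{[k]}|_{\setminus T} = \mathrm{cl}^{[k]}|_{/T}$, then $\mathrm{cl}^{[k]}|_{\setminus \lceil T \rceil} = \mathrm{cl}^{[k]}|_{/\lceil T \rceil}$. For any $Y \subseteq \lfloor S \rfloor$, let $X = Y \cup (S \setminus \lfloor S \rfloor)$; then $X_V = Y_V$
and $X \cup T = Y \cup \lceil T \rceil$. We then have

\[ \{[v] : v \in \mathrm{cl}(Y_V)\} \cap S = \{[v] : v \in \mathrm{cl}(X_V)\} \cap S \]
\[ = \{[v] : v \in \mathrm{cl}((X \cup T)_V)\} \cap S \]
\[ = \{[v] : v \in \mathrm{cl}((Y \cup \lceil T \rceil)_V)\} \cap S, \]

and in particular, their intersections with $\lfloor S \rfloor$ are equal, thus proving the claim.

Now suppose $\mathrm{cl}^{[k]}$ is disconnected, then $\mathrm{cl}^{[k]}|_{\setminus T} = \mathrm{cl}^{[k]}|_{/T}$ for some $T = \lceil T \rceil$ (and hence $S = \lfloor S \rfloor$
and $V = S_V \cup T_V$). Then for any $X \subseteq S$, $(X \cup T)_V = X_V \cup T_V$ and we have

\[ \{[v] : v \in \mathrm{cl}(X_V) \cap S_V\} = \{[v] : v \in \mathrm{cl}(X_V)\} \cap S \]
\[ = \{[v] : v \in \mathrm{cl}(X_V \cup T_V)\} \cap S \]
\[ = \{[v] : v \in \mathrm{cl}(X_V \cup T_V) \cap S_V\}, \]

and hence $\mathrm{cl}|_{\setminus T_V}(X_V) = \mathrm{cl}|_{/T_V}(X_V)$ for all $X_V \subseteq S_V$. ■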
VIII. Closures of rank two

In this section, we prove that all digraphs with rank 2 are solvable, but that there exist closures of 
rank 2 which are not solvable. The property which makes digraphs special is defined below. 

Definition 18: The closure operator cl is separable if any closed set of outer rank 1 is contained
in a unique flat of rank 1.

We can refer to the family of flats of rank 1 as $\{\mathrm{cl}(v') : v' \in V'\}$ for some $V' \subseteq V$. If cl is separable,
then the sets $\{\mathrm{cl}(v') \setminus \mathrm{cl}(\emptyset) : v' \in V'\}$ form a partition of $V \setminus \mathrm{cl}(\emptyset)$. Note that a simple matroid may not
be separable, as we shall see in Example 2 below.

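Lemma 8: For any $D$, $\mathrm{cl}_D$ is separable.

Proof: We prove that for any $a, b \in V$ such that $a \notin \mathrm{cl}_D(b)$, $b \notin \mathrm{cl}_D(a)$, we have $\mathrm{cl}_D(a) \cap \mathrm{cl}_D(b) =
\mathrm{cl}_D(\emptyset)$. Recall that for any $v \in V$ and any $X \subseteq V$, $v \in \mathrm{cl}_D(X)$ if and only if $v \in X$ or $v^- \subseteq \mathrm{cl}_D(X)$.
Now consider $a, b \in V$ such that $a \notin \mathrm{cl}_D(b)$ and $b \notin \mathrm{cl}_D(a)$. For any $w \in c_D(a)$, either $w^- = \emptyset$ whence
$w \in \mathrm{cl}_D(\emptyset)$ or $w^- = a$ and hence $w \notin \mathrm{cl}_D(b)$. Similarly, for any $u \in c_D^2(a)$, $u^-$ is not empty and
contained in $c_D(a)$, hence $u \notin \mathrm{cl}_D(b)$. By induction, we obtain that $\mathrm{cl}_D(a) \cap \mathrm{cl}_D(b) = \mathrm{cl}_D(\emptyset)$.

Therefore, for any $v \in V$, the closed sets of outer rank 1 containing $\mathrm{cl}(v)$ must form a chain: $\mathrm{cl}(v) \subseteq
\ldots \subseteq \mathrm{cl}(v')$, where $\mathrm{cl}(v')$ is the unique flat of rank 1 containing $\mathrm{cl}(v)$. ■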

Theorem 9: Let cl be a separable closure operator of rank 2. Then it is solvable over all sufficiently 
large alphabets. 

Proof: For all sufficiently large alphabets, we can find $|V'|$ mutually orthogonal Latin squares; simply
assign the same Latin square to all elements of the same part of the partition described above. In $\mathrm{cl}(\emptyset)$,
simply assign the universal partition. This clearly has entropy 2; checking that it is a coding function for
cl is easy. ■
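
As an illustration of the ingredient used in this proof, the sketch below builds mutually orthogonal Latin squares for a prime alphabet size $p$ using the classical construction $L_a(x, y) = ax + y \bmod p$. The restriction to primes and the function names are assumptions made for the example; the proof only requires that enough such squares exist for large alphabets.

```python
from itertools import combinations


def latin_square(a, p):
    """p x p array with entry (a*x + y) mod p in row x, column y."""
    return [[(a * x + y) % p for y in range(p)] for x in range(p)]


def is_latin(square):
    """Every symbol appears exactly once in each row and each column."""
    p = len(square)
    symbols = set(range(p))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[x][y] for x in range(p)} == symbols for y in range(p))
    return rows_ok and cols_ok


def are_orthogonal(s, t):
    """Superimposing s and t yields every ordered pair of symbols exactly once."""
    p = len(s)
    pairs = {(s[x][y], t[x][y]) for x in range(p) for y in range(p)}
    return len(pairs) == p * p


def mols(p):
    """The p - 1 squares L_1, ..., L_{p-1}; pairwise orthogonal when p is prime."""
    return [latin_square(a, p) for a in range(1, p)]


squares = mols(5)
print(all(is_latin(s) for s in squares))
print(all(are_orthogonal(s, t) for s, t in combinations(squares, 2)))
```

Each square partitions the $p^2$ cells into $p$ classes of equal size, and orthogonality means that any two squares jointly identify the cell, which is the property behind the entropy-2 claim.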

Corollary 6: If $n - \mathrm{mias}(D) = 2$, then $\mathrm{cl}_D$ is solvable over all sufficiently large alphabets.

However, as soon as the closure is not separable, everything collapses, as seen from the example below. 

Example 2: Let cl be the closure operator defined on $\{1,2,3,4\}$ as follows: $\mathrm{cl}(\emptyset) = \emptyset$, $\mathrm{cl}(1) = 12$,
$\mathrm{cl}(2) = 2$, $\mathrm{cl}(3) = 23$, $\mathrm{cl}(4) = 4$, $\mathrm{cl}(13) = \mathrm{cl}(24) = 1234$. Then cl has rank 2 but is not separable, since
2 is contained in two flats of rank 1: 12 and 23.

Then we claim that this closure has entropy 1.5, which is achieved over all alphabets of square 
cardinality. 
First, let us give an upper bound on the entropy. We have

\[
\begin{aligned}
H(V) &= H(123) \le H(12) + H(23) - H(2) \le 2 - H(2), \\
H(V) &= H(24) \le H(2) + H(4) \le 1 + H(2),
\end{aligned}
\]

whence, adding the two inequalities, $2H(V) \le 3$, i.e. $H(V) \le 1.5$.

Second, suppose $A = B^2$ and denote any element of $A^2 = B^4$ as $b = (b_1, b_2, b_3, b_4)$. For any
$1 \le k \le 4$, let $E_k$ be the partition into $|B|$ parts according to the $k$-th coordinate: $P_i(E_k) = \{b : b_k = i\}$.
Let $f_1 = E_{1,2}$, $f_2 = E_1$, $f_3 = E_{1,3}$, $f_4 = E_{2,3}$; then $f$ is a coding function for cl, and $H_f(V) = 1.5$.
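
The construction of Example 2 can be checked by brute force over the smallest case $B = \{0, 1\}$. The sketch below is illustrative and rests on three assumptions: $E_{j,k}$ is read as the partition according to coordinates $j$ and $k$ jointly; a coding function is taken to be one where the functions indexed by $X$ determine every function indexed by $\mathrm{cl}(X)$; and entropy is the logarithm, in base $|A|$, of the number of parts (valid here because all parts have equal size). Only the closures listed in the example are checked.

```python
from itertools import product
from math import log

B = (0, 1)                             # assumed base alphabet, so |A| = |B|^2 = 4
POINTS = list(product(B, repeat=4))    # elements of A^2 = B^4

# 0-based coordinates read by f_1, ..., f_4 (f_1 = E_{1,2}, f_2 = E_1, ...).
COORDS = {1: (0, 1), 2: (0,), 3: (0, 2), 4: (1, 2)}

# Closures listed in Example 2, written as tuples of elements.
LISTED = {(): (), (1,): (1, 2), (2,): (2,), (3,): (2, 3), (4,): (4,),
          (1, 3): (1, 2, 3, 4), (2, 4): (1, 2, 3, 4)}


def signature(point, variables):
    """Joint value at one point of the functions f_v for v in variables."""
    return tuple(tuple(point[c] for c in COORDS[v]) for v in sorted(variables))


def entropy(variables):
    """log base |A| of the number of parts of the joint partition."""
    parts = {signature(pt, variables) for pt in POINTS}
    return log(len(parts)) / log(len(B) ** 2)


def determines(xs, ys):
    """True if the joint value of the f_x, x in xs, fixes that of the f_y, y in ys."""
    seen = {}
    for pt in POINTS:
        if seen.setdefault(signature(pt, xs), signature(pt, ys)) != signature(pt, ys):
            return False
    return True


print(all(determines(x, cx) for x, cx in LISTED.items()))
print(entropy((1, 2, 3, 4)))
```

Together the four functions read only coordinates 1, 2 and 3, so the joint partition has $|B|^3$ parts, which is where the value 1.5 comes from.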

The technique used in the example above can be generalised to produce "arbitrarily tight bottlenecks" 
for closure operators. 

Theorem 10: For any $r$ and any $\epsilon > 0$, there exists a closure operator of rank $r$ and maximum entropy less
than $1 + \epsilon$.

Proof: We construct the closure operator as follows. Let $k$ be an integer satisfying

\[
k > \frac{\log(1 + \epsilon) - \log \epsilon}{\log r - \log(r - 1)} - 1.
\]

Consider $r$ sets $A_1, \ldots, A_r$ of cardinality $r$ each. For any $i$, we consider the set of words over the alphabet
$A_i$ of length at most $k$: $V = \bigcup_{i=1}^{r} \bigcup_{j=0}^{k} A_i^j$. We denote the empty word over $A_i$ as $\delta_i$, i.e. $A_i^0 = \{\delta_i\}$.
We also define a partial order on $V$, where $u \le v$ if and only if $u$ is an initial segment of $v$. Note that
$\delta_i \le v$ if and only if $v$ is a word over $A_i$. For any $X \subseteq V$, we denote by $s(X)$ the largest size of a family of
pairwise incomparable elements of $X$. The closure of a set of words $X$ is given by



\[
\mathrm{cl}(X) =
\begin{cases}
V & \text{if } s(X) \ge r, \\
\bigcup_{v \in X} \{u \in V : u \le v\} & \text{otherwise.}
\end{cases}
\]
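
For small parameters this closure operator can be computed directly. The sketch below is illustrative: a word is represented as a pair `(i, letters)` with `i` naming the alphabet $A_i$ (a hypothetical encoding), and the threshold is read as $s(X) \ge r$, which is what is needed for $r$ pairwise incomparable words to generate $V$. It also uses the fact that, in a prefix order, the largest antichain inside $X$ has as many elements as there are words of $X$ with no proper extension in $X$.

```python
from itertools import product


def all_words(r, k):
    """V: all words of length 0..k over each of r disjoint alphabets of size r."""
    return {(i, w) for i in range(r) for j in range(k + 1)
            for w in product(range(r), repeat=j)}


def is_prefix(u, v):
    """u <= v: same alphabet, and u is an initial segment of v."""
    return u[0] == v[0] and v[1][:len(u[1])] == u[1]


def s(X):
    """Largest number of pairwise incomparable elements of X: count the
    elements of X that have no proper extension inside X."""
    return sum(1 for u in X if not any(u != v and is_prefix(u, v) for v in X))


def closure(X, r, k):
    """V if X contains r pairwise incomparable words, else all prefixes of X."""
    V = all_words(r, k)
    if s(X) >= r:
        return V
    return {u for u in V if any(is_prefix(u, v) for v in X)}


r, k = 3, 2
V = all_words(r, k)
print(len(V))
print(sorted(closure({(0, (1, 2))}, r, k)))
```

With $r = 3$ and $k = 2$, the three empty words, one per alphabet, are pairwise incomparable, so `closure` returns all of `V` for them.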
We now give an upper bound on the maximum entropy of $V$. We shall denote $S_l := \sum_{w \in A_1^l} H(w)$.
Suppose $A_1 = \{1, \ldots, r\}$; let $v_i = (1, \ldots, 1, i) \in A_1^k$ for all $1 \le i \le r$ and let $v' = (1, \ldots, 1) \in A_1^{k-1}$
be their common initial segment. The $v_i$'s are all pairwise incomparable, thus they generate $V$. Since
$\mathrm{cl}(v_i) \cap \mathrm{cl}(v_j) = \mathrm{cl}(v')$ for all $i \ne j$, we have

\[
H(V) = H(v_1, \ldots, v_r) \le \sum_{i=1}^{r} H(v_i) - (r - 1) H(v')
\]

by repeatedly applying the submodular inequality. Performing this operation for any value of the initial 
segment yields 

\[
r^{k-1} H(V) \le S_k - (r - 1) S_{k-1}.
\]

This method was performed on words of length $k$, but it can easily be adapted to any length between 1
and $k - 1$, so that

\[
\begin{aligned}
r^{k-2} H(V) &\le S_{k-1} - (r - 1) S_{k-2}, \\
&\;\;\vdots \\
H(V) &\le S_1 - (r - 1) H(\delta_1).
\end{aligned}
\]

Summing up, we obtain 

\[
A \, H(V) \le r^k - (r - 1)^k H(\delta_1),
\]

where

\[
A = r^{k-1} + (r - 1) r^{k-2} + \ldots + (r - 1)^{k-1} = r^k - (r - 1)^k.
\]
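
The weights behind this summation are not spelled out in the text. One reading that reproduces the stated value of $A$ (offered here as an assumption) multiplies the inequality for words of length $j$ by $(r-1)^{k-j}$, so that the right-hand sides telescope:

```latex
\begin{align*}
\sum_{j=1}^{k} (r-1)^{k-j}\, r^{j-1}\, H(V)
  &\le \sum_{j=1}^{k} (r-1)^{k-j} \bigl( S_j - (r-1) S_{j-1} \bigr) \\
  &= S_k - (r-1)^k S_0 \\
  &\le r^k - (r-1)^k H(\delta_1).
\end{align*}
```

The last step uses $S_0 = H(\delta_1)$ and $S_k \le r^k$, assuming each of the $r^k$ words of length $k$ has entropy at most 1. The coefficient on the left is the geometric sum $\sum_{j=1}^{k} (r-1)^{k-j} r^{j-1} = r^k - (r-1)^k$.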

Summing up for all $A_i$'s, and finally using the fact that the $\delta_i$'s generate $V$, i.e. $H(V) \le \sum_{i=1}^{r} H(\delta_i)$, we
obtain

\[
(rA + (r - 1)^k) H(V) \le r^{k+1}.
\]



An easy manipulation finally leads to

\[
H(V) \le 1 + \frac{(r - 1)^{k+1}}{r^{k+1} - (r - 1)^{k+1}} < 1 + \epsilon.
\]
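
The final bound is easy to evaluate numerically. The sketch below is illustrative: it reads the condition on $k$ as the strict inequality $k > (\log(1+\epsilon) - \log\epsilon)/(\log r - \log(r-1)) - 1$ (an assumption, chosen so that the conclusion is strict), picks the smallest such integer, and evaluates $r^{k+1}/(r^{k+1} - (r-1)^{k+1})$ exactly. The function names are hypothetical.

```python
from fractions import Fraction
from math import floor, log


def smallest_k(r, eps):
    """Smallest integer k >= 1 strictly above the threshold (requires r >= 2)."""
    threshold = (log(1 + eps) - log(eps)) / (log(r) - log(r - 1)) - 1
    return max(1, floor(threshold) + 1)


def entropy_bound(r, k):
    """Exact value of r^(k+1) / (r^(k+1) - (r-1)^(k+1))."""
    return Fraction(r ** (k + 1), r ** (k + 1) - (r - 1) ** (k + 1))


r, eps = 3, 0.1
k = smallest_k(r, eps)
print(k, entropy_bound(r, k), float(entropy_bound(r, k)))
```

For $r = 3$ and $\epsilon = 0.1$ the bound first drops below $1 + \epsilon$ at the value of $k$ returned by `smallest_k`; one step earlier it is still above.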